19.7.4: After HA failover mobile IPsec users stay connected to the now backup FW

Started by rainerle, September 16, 2019, 10:35:09 AM

Hi,

After a few failovers I realised that a user connected via IPsec (with MOBIKE enabled) stays connected to the firewall he initially connected to. The connection itself becomes unusable because routing no longer works, but the VPN client does not reconnect to the new master HA device.

If one of the firewalls reboots, the client reconnects properly. The scenario above only occurs if the failover is triggered by disabling the CARP interfaces in /carp_status.php.

Is there an existing best practice or workaround for this? From my point of view, a CARP failover should trigger service restarts on the old master HA device...

Can I somehow use
/usr/local/opnsense/service/configd_ctl.py
to restart strongswan as soon as an active master becomes backup?


root@opnsense01:~ # cat /usr/local/etc/devd/carp.conf
#
# CARP notify hooks. This will call carpup/carpdown with the
# interface (carp0, carp1) as the first parameter.
#

notify 101 {
    match "system"          "CARP";
    match "subsystem"       "[0-9]+@[0-9a-z]+";
    match "type"            "(MASTER|BACKUP)";
    action "/usr/local/opnsense/service/configd_ctl.py interface carp $subsystem $type";
};
root@opnsense01:~ #


Yes, it does. It uses a DNS name that points to the CARP IP.

I have no test cluster available yet, but from all the HA woes I think I have to build one...

Reading up on
https://wiki.strongswan.org/projects/strongswan/wiki/MobIke
https://tools.ietf.org/html/rfc4555


  • The mobile client initiates the connection using the CARP IP
  • The OPNsense has the interface IP and the CARP IP and lets the client know about both
  • The client decides which one to use
  • As the failover occurs the client switches to the still available interface address

So either disable MobIke or restart the IPsec service when a manual failover occurs that leaves the previous master HA device alive.

MobIke is fantastic for roaming mobile clients, so I would prefer the second option.

I remember a discussion in a GitHub issue about restarting daemons after failover; I think you were also involved.

Maybe worth opening a feature request, but no guarantee.

rc.syshook "carp" event with "pluginctl -s strongswan restart" should do the trick.


Cheers,
Franco
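
For anyone trying the syshook suggestion above, here is a minimal sketch of what such a hook could look like. It is not taken from OPNsense itself. It assumes the hook lives in the carp syshook directory (probably /usr/local/etc/rc.syshook.d/carp/) and receives the same two arguments that carp.conf hands to configd_ctl.py, i.e. the subsystem (vhid@interface) and the new state. The function name carp_hook and the RESTART_CMD variable are made up for this example; only the restart command itself comes from the post above.

```shell
#!/bin/sh
# Sketch of a CARP syshook that restarts strongSwan when this node
# drops to BACKUP.
# Assumption: $1 = subsystem (vhid@interface), $2 = MASTER or BACKUP,
# the same values carp.conf passes on via $subsystem and $type.

# Restart command as suggested above. RESTART_CMD is a hypothetical
# override so the script can be dry-run without touching the service.
RESTART_CMD="${RESTART_CMD:-pluginctl -s strongswan restart}"

carp_hook() {
    subsystem="$1"
    state="$2"

    case "$state" in
    BACKUP)
        # This node just lost MASTER: restart so that stale mobile
        # tunnels are dropped and clients have to reconnect.
        echo "CARP $subsystem is now BACKUP, restarting strongSwan" >&2
        $RESTART_CMD
        ;;
    *)
        # MASTER or anything else: nothing to do in this sketch.
        ;;
    esac
}

carp_hook "$@"
```

A dry run would be something like RESTART_CMD="echo restart" sh ./hook.sh 1@lagg0 BACKUP. Note that with several VHIDs the hook would probably fire once per VHID, so the restart may run more than once per failover.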


Quote from: franco on September 16, 2019, 03:33:48 PM
rc.syshook "carp" event with "pluginctl -s strongswan restart" should do the trick.

So apparently there is a bug:
- created a script 20-ipsec as a copy from the 50-frr script
- added logging output as the stop/start command was never executed
- realised that /usr/local/etc/devd/carp.conf never picks up on my CARP interfaces since they are all on lagg devices...

Created a PR https://github.com/opnsense/core/pull/3721
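
A quick way to check whether a given CARP subsystem string gets past the "subsystem" pattern from carp.conf is to test it with grep. This is only a sketch: the interface names below are hypothetical examples, not the ones from this cluster, and it assumes devd matches the pattern against the whole string (hence the ^ and $ added here).

```shell
#!/bin/sh
# Pattern copied from the "subsystem" match in carp.conf.
PATTERN='[0-9]+@[0-9a-z]+'

# Assumption: devd anchors the pattern, so the whole subsystem string
# has to match, not just a part of it.
matches_subsystem() {
    printf '%s\n' "$1" | grep -Eq "^(${PATTERN})\$"
}

# Hypothetical subsystem strings in vhid@interface form.
for s in "1@em0" "1@lagg0" "1@lagg0_vlan10"; do
    if matches_subsystem "$s"; then
        echo "$s: match"
    else
        echo "$s: no match"
    fi
done
```

With these sample names only the one containing an underscore is rejected, since "_" is not in [0-9a-z]. A plain lagg0 passes, so if the hook never fires it is worth comparing the real interface names against the pattern.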