Hi,
Preface:
I have a persistently connected OVPN client instance configured, connecting to my customer's network.
I want it to be highly available, so the client config is replicated between the HA nodes.
The remote OVPN server allows only one connection per user account, and I have only one account there, so only one client can be connected at a time (on either the primary or the secondary node).
I achieved that by configuring the client to follow the WAN CARP VIP.
By default the WAN CARP VIP is present on the primary node, so the OVPN client runs on the primary node and is shut down on the secondary.
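To make the intended behaviour concrete, here is a minimal Python sketch of the rule described above. It is only a model of the setup, not OPNsense code; the names `CarpState`, `Node` and `client_should_run` are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class CarpState(Enum):
    MASTER = "MASTER"
    BACKUP = "BACKUP"


@dataclass
class Node:
    name: str
    wan_vip_state: CarpState  # state of the WAN CARP VIP on this node


def client_should_run(node: Node) -> bool:
    """The OVPN client follows the WAN CARP VIP: it runs only where the VIP is MASTER."""
    return node.wan_vip_state is CarpState.MASTER


# Default situation: the VIP sits on the primary node.
primary = Node("primary", CarpState.MASTER)
secondary = Node("secondary", CarpState.BACKUP)

running = [n.name for n in (primary, secondary) if client_should_run(n)]
print(running)  # ['primary']
```

With this rule exactly one node runs the client at any time, which is what the single-account limit on the remote server requires.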
Issue:
That worked perfectly up to 24.7.
In 25.1 it initially seemed to work well too - if the CARP VIP moves to the standby node, the client on the primary node shuts down and spins up on the secondary node, and so on. So far so good.
Unfortunately something's broken in 25.1:
If the VIP is on the primary node and I go to HA->Status on the primary node and click "Synchronize and restart services", HA tries to (re)start the OVPN client on the secondary node (even though the VIP has not moved there).
Weirdly, this does not happen instantly but after a while.
With two OVPN clients (one on the primary and one on the secondary node) spun up and trying to use the same account, while the remote OVPN server allows only one connection per account, weird things start to happen: the clients randomly and periodically connect and disconnect.
What fixes it is manually stopping the OVPN client on the standby node, or moving the VIP to the secondary node and back (by entering and leaving persistent CARP maintenance mode).
Any idea what is wrong? Could this be a bug in 25.1?
Actually, I have two client instances (to two customers) and both behave exactly the same way.
I 'resolved' this by reconfiguring the OVPN servers to allow multiple connections per VPN user (i.e. enabling the duplicate-cn option) and no longer binding the OVPN clients to the CARP VIP. This way the OVPN clients can stay up on both HA nodes (primary and secondary).
But the issue is certainly there and does not affect OVPN clients only.
An attempt to synchronize HA settings causes ALL services on the secondary node to be started, whether they are expected to run or not.
I'm surprised no one noticed it and responded.
This is not how it worked in 24.7 and before.
For example, this also affects the iperf3 service.
By default the iperf service is not started unless an iperf instance is enabled/created on the Interfaces/Diagnostics/iperf page.
An attempt to synchronize the HA config spins up iperf on the secondary node, even though it is not running on the primary HA node and is not configured to run on the secondary node.
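The difference between the expected and the observed behaviour after a sync can be sketched in Python as follows. This is only a model of what is reported in this thread, not OPNsense code: the `Service` class, both functions and the `example-service` entry are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    configured: bool   # an instance is enabled/created in the GUI
    follows_vip: bool  # bound to the WAN CARP VIP


def expected_on_backup(services):
    """Services that should run on a node that does NOT hold the VIP."""
    return {s.name for s in services if s.configured and not s.follows_vip}


def observed_on_backup_after_sync(services):
    """Behaviour reported for 25.1: every service gets started."""
    return {s.name for s in services}


services = [
    Service("openvpn-client", configured=True, follows_vip=True),
    Service("iperf3", configured=False, follows_vip=False),
    # Hypothetical service that is meant to run on both nodes.
    Service("example-service", configured=True, follows_vip=False),
]

unexpected = observed_on_backup_after_sync(services) - expected_on_backup(services)
print(sorted(unexpected))  # ['iperf3', 'openvpn-client']
```

In this model the two services that should stay down on the secondary node are exactly the ones mentioned in this thread: the VIP-bound OVPN client and the unconfigured iperf3.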