Hello everyone,
I am trying to implement automatic failover between two ProtonVPN WireGuard tunnels on OPNsense, but I am hitting what seems to be unreliable behavior with WireGuard gateway groups and dpinger.
Environment:
- Latest OPNsense version
- Two ProtonVPN WireGuard tunnels
- VLAN100 traffic policy-routed through a gateway group
- Primary tunnel: CHPROTON-BKP (wg3)
- Backup tunnel: CHPROTON
- Gateway group configured with Tier1/Tier2 failover
- Both WireGuard tunnels are operational and working correctly individually
Observed behavior:
- Traffic works correctly through the primary WG tunnel
- If I manually disable the primary gateway/interface, failover works and traffic correctly moves to the backup tunnel
- However, if the primary Proton peer becomes unusable or disconnects, the WireGuard interface often remains "UP"
- dpinger/gateway monitoring still considers the gateway healthy
- Gateway group does not fail over correctly
- Traffic blackholes until I manually disable the dead gateway/interface
Additional observations:
- Proton tunnels reuse the same local tunnel IP (10.2.0.2), which may contribute to routing/state ambiguity
- OPNsense locally-generated traffic (ping/curl from firewall itself) does not seem to follow policy-routing/gateway-group rules, making reliable local health probing difficult
- Handshake age and RX/TX counters are not reliable enough because WireGuard keepalives continue even when real Internet traffic is broken
- Packet captures confirm that route-to rules are applied and packets leave through the expected WG interface, but sessions still fail when the gateway/group logic gets stuck
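For reference, the handshake-age check mentioned above can be read roughly like this in Python. This is only a minimal sketch, assuming `wg show <interface> latest-handshakes` prints one `<public-key> <unix-timestamp>` pair per line, with a timestamp of 0 meaning no handshake has happened yet. The function names are illustrative and not part of any OPNsense API. As noted above, a fresh handshake only shows that the peer is answering, not that traffic is actually being forwarded beyond it.

```python
import subprocess
import time
from typing import Dict, Optional


def parse_latest_handshakes(output: str) -> Dict[str, int]:
    """Map peer public key -> last handshake time (epoch seconds, 0 = never)."""
    result: Dict[str, int] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, stamp = parts
        try:
            result[key] = int(stamp)
        except ValueError:
            continue  # skip lines whose second field is not a number
    return result


def handshake_age(output: str, now: Optional[float] = None) -> Optional[float]:
    """Seconds since the most recent handshake, or None if no peer ever completed one."""
    now = time.time() if now is None else now
    stamps = [ts for ts in parse_latest_handshakes(output).values() if ts > 0]
    if not stamps:
        return None
    return now - max(stamps)


def read_handshake_age(interface: str) -> Optional[float]:
    """Run `wg show <interface> latest-handshakes` and return the handshake age."""
    proc = subprocess.run(
        ["wg", "show", interface, "latest-handshakes"],
        capture_output=True, text=True, check=True,
    )
    return handshake_age(proc.stdout)
```

Calling `read_handshake_age("wg3")` on the firewall would return the age for the primary tunnel; it needs root privileges and the `wg` tool in the PATH.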
What I already tested:
- Different monitor IPs
- Gateway groups with trigger levels
- State resets
- Manual route testing
- tcpdump validation on WG interfaces
- Monitoring handshake age
- curl/ping probes
- Rebooting OPNsense
- Manually disabling/enabling gateways/interfaces
At this point I am considering:
1. Keeping only one active Proton tunnel and one cold standby
2. Creating a custom watchdog script using Monit
3. Using WAN failover underneath a single WireGuard tunnel instead of multiple WG tunnels in a gateway group
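To make option 2 more concrete, here is a rough sketch of the decision logic such a watchdog could use. It is only an illustration under stated assumptions: the class name, the thresholds, and the "failover"/"failback" action strings are hypothetical. The probe result is assumed to come from some end-to-end test that really exits through the primary tunnel, which is exactly the part that is hard to get right here. The actual disabling or enabling of the gateway is deliberately left out, since it depends on how OPNsense is driven.

```python
from dataclasses import dataclass


@dataclass
class FailoverWatchdog:
    """Counts consecutive probe results and decides when to switch tunnels."""
    fail_threshold: int = 3      # consecutive failures before failing over
    recover_threshold: int = 5   # consecutive successes before failing back
    primary_active: bool = True
    _fails: int = 0
    _oks: int = 0

    def observe(self, probe_ok: bool) -> str:
        """Feed one probe result; return 'failover', 'failback', or 'none'."""
        if probe_ok:
            self._oks += 1
            self._fails = 0
        else:
            self._fails += 1
            self._oks = 0

        if self.primary_active and self._fails >= self.fail_threshold:
            self.primary_active = False
            self._fails = 0
            return "failover"   # caller would disable the primary gateway here
        if not self.primary_active and self._oks >= self.recover_threshold:
            self.primary_active = True
            self._oks = 0
            return "failback"   # caller would re-enable the primary gateway here
        return "none"
```

Requiring several consecutive failures before acting, and more consecutive successes before failing back, is meant to avoid flapping when a single probe is lost.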
My questions:
- Has anyone successfully implemented reliable automatic failover between multiple ProtonVPN WireGuard tunnels on OPNsense?
- If yes, how are you detecting tunnel failure reliably?
- Are you using gateway groups, Monit scripts, dpinger, or another method?
- Has anyone found a clean workaround for locally-generated traffic not following policy-routing rules?
- Is this a known limitation or bug with WireGuard + gateway groups on OPNsense?
Any feedback or working architectures would be greatly appreciated.
Thanks!