Gateway monitoring and ProtonVPN with WireGuard

Started by rumshot, Today at 02:16:47 AM

Hello everyone,

I am trying to implement automatic failover between two ProtonVPN WireGuard tunnels on OPNsense, but I am hitting what seems to be unreliable behavior with WireGuard gateway groups and dpinger.

Environment:
- Latest OPNsense version
- Two ProtonVPN WireGuard tunnels
- VLAN100 traffic policy-routed through a gateway group
- Primary tunnel: CHPROTON-BKP (wg3)
- Backup tunnel: CHPROTON
- Gateway group configured with Tier1/Tier2 failover
- Both WireGuard tunnels are operational and working correctly individually

Observed behavior:
- Traffic works correctly through the primary WG tunnel
- If I manually disable the primary gateway/interface, failover works and traffic correctly moves to the backup tunnel
- However, if the primary Proton peer becomes unusable or disconnected, the WireGuard interface often still remains "UP"
- dpinger/gateway monitoring still considers the gateway healthy
- Gateway group does not fail over correctly
- Traffic blackholes until I manually disable the dead gateway/interface

Additional observations:
- Proton tunnels reuse the same local tunnel IP (10.2.0.2), which may contribute to routing/state ambiguity
- OPNsense locally-generated traffic (ping/curl from firewall itself) does not seem to follow policy-routing/gateway-group rules, making reliable local health probing difficult
- Handshake age and RX/TX counters are not reliable enough because WireGuard keepalives continue even when real Internet traffic is broken
- Packet captures confirm that route-to rules are applied and packets leave through the expected WG interface, but sessions still fail when the gateway/group logic gets stuck
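
For reference, this is roughly how I checked handshake age. `wg show <interface> latest-handshakes` prints one line per peer: the public key and a Unix timestamp (0 if there has never been a handshake). The sketch below is a minimal Python version of that check; the function names and the 180-second threshold are my own assumptions, not anything OPNsense ships. As noted above, a fresh handshake only shows that the peer still answers, not that Internet traffic actually passes, so this catches a dead peer but not a broken exit.

```python
import time
from typing import Dict, List, Optional


def parse_latest_handshakes(output: str) -> Dict[str, int]:
    """Map peer public key -> last handshake time (Unix seconds, 0 = never)."""
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        peer, stamp = parts
        handshakes[peer] = int(stamp)
    return handshakes


def stale_peers(output: str, max_age: int = 180,
                now: Optional[float] = None) -> List[str]:
    """Return peers that never completed a handshake or whose last one is too old."""
    now = time.time() if now is None else now
    return [
        peer
        for peer, stamp in parse_latest_handshakes(output).items()
        if stamp == 0 or now - stamp > max_age
    ]
```

The text passed in would be the captured output of the `wg show` command for the tunnel interface (wg3 for the primary in my case).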

What I already tested:
- Different monitor IPs
- Gateway groups with trigger levels
- State resets
- Manual route testing
- tcpdump validation on WG interfaces
- Monitoring handshake age
- curl/ping probes
- Rebooting OPNsense
- Manually disabling/enabling gateways/interfaces

At this point I am considering:
1. Keeping only one active Proton tunnel and one cold standby
2. Creating a custom watchdog script using Monit
3. Using WAN failover underneath a single WireGuard tunnel instead of multiple WG tunnels in a gateway group
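
For option 2, the decision logic I have in mind looks something like the sketch below. It is only a sketch under assumptions: `probe` is a hypothetical callable that returns True when an end-to-end test through the primary tunnel succeeds, the thresholds are arbitrary, and the part that actually disables or re-enables the gateway is deliberately left out because I do not have a reliable way to do that yet. The idea is to require several consecutive failures before switching and several consecutive successes before switching back, so a few dropped probes do not cause flapping.

```python
from typing import Callable


class FailoverWatchdog:
    """Choose 'primary' or 'backup' from consecutive probe results."""

    def __init__(self, probe: Callable[[], bool],
                 fail_after: int = 3, recover_after: int = 5):
        self.probe = probe                  # hypothetical end-to-end check of the primary tunnel
        self.fail_after = fail_after        # consecutive failures before failing over
        self.recover_after = recover_after  # consecutive successes before failing back
        self.active = "primary"
        self._fails = 0
        self._oks = 0

    def tick(self) -> str:
        """Run one probe and return which tunnel should carry traffic."""
        if self.probe():
            self._oks += 1
            self._fails = 0
        else:
            self._fails += 1
            self._oks = 0

        if self.active == "primary" and self._fails >= self.fail_after:
            self.active = "backup"
        elif self.active == "backup" and self._oks >= self.recover_after:
            self.active = "primary"
        return self.active
```

A scheduler such as Monit or cron would call `tick()` once per interval and act only when the returned value changes.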

My questions:
- Has anyone successfully implemented reliable automatic failover between multiple ProtonVPN WireGuard tunnels on OPNsense?
- If yes, how are you detecting tunnel failure reliably?
- Are you using gateway groups, Monit scripts, dpinger, or another method?
- Did anyone find a clean workaround for locally-generated traffic not following policy-routing rules?
- Is this considered a known limitation/bug with WireGuard + gateway groups on OPNsense?

Any feedback or working architectures would be greatly appreciated.

Thanks!

Gateway failover depends on the state of the gateway.
This means you need two gateways, one per tunnel. Each gateway must be unique, and each must be routed through its own tunnel.

To achieve that, create a gateway, bind it to the specific tunnel interface, and do not disable the host routes. This forces that gateway to route, and to be reachable, only via that specific tunnel. When the tunnel goes down, the gateway goes down with it, because a static route points over the tunnel, and gateway monitoring and failover will trigger properly.
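
As a quick sanity check on the "unique per tunnel" requirement, something like the following Python sketch can flag values shared between gateways. Which fields actually have to be unique is my assumption here (interface, tunnel address, monitor IP), and the `Gateway` structure is made up for illustration; it is not an OPNsense data format.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Gateway(NamedTuple):
    interface: str   # tunnel interface the gateway is bound to
    tunnel_ip: str   # local tunnel address
    monitor_ip: str  # address the gateway monitor pings


def find_collisions(gateways: Dict[str, Gateway]) -> Dict[str, List[str]]:
    """Return 'field=value' entries that are used by more than one gateway."""
    seen = defaultdict(list)
    for name, gw in gateways.items():
        for field, value in gw._asdict().items():
            seen[f"{field}={value}"].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}
```

With two gateways that both use 10.2.0.2 as the tunnel address, the result contains a `tunnel_ip=10.2.0.2` entry listing both names; an empty result means no overlap was found in the fields checked.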

Regards,
S.

Quote from: rumshot on Today at 02:16:47 AM
"Proton tunnels reuse the same local tunnel IP (10.2.0.2), which may contribute to routing/state ambiguity"
What do you mean by that?
In my experience, those tunnels stay up almost all the time; only ICMP gets dropped from time to time, nothing else.

Hi Bob,

All the Proton configs have the same tunnel address; the only things that change, beyond the keys, are the endpoints.



Quote from: Bob.Dig on Today at 10:59:13 AM
Quote from: rumshot on Today at 02:16:47 AM
"Proton tunnels reuse the same local tunnel IP (10.2.0.2), which may contribute to routing/state ambiguity"
What do you mean by that?
In my experience, those tunnels stay up almost all the time; only ICMP gets dropped from time to time, nothing else.

Thanks for your response.
If the tunnels need to have different IP addressing, I won't be able to have two connections with Proton, since their tunnels always have the same address; only the public IP of the endpoint changes, according to the location.

Pretty sad. I will try with Proton and another VPN provider.

Thanks


Quote from: Seimus on Today at 09:53:31 AM
Gateway failover depends on the state of the gateway.
This means you need two gateways, one per tunnel. Each gateway must be unique, and each must be routed through its own tunnel.

To achieve that, create a gateway, bind it to the specific tunnel interface, and do not disable the host routes. This forces that gateway to route, and to be reachable, only via that specific tunnel. When the tunnel goes down, the gateway goes down with it, because a static route points over the tunnel, and gateway monitoring and failover will trigger properly.

Regards,
S.

Today at 12:58:04 PM #5 Last Edit: Today at 07:27:56 PM by Bob.Dig
Quote from: rumshot on Today at 12:38:49 PM
"I won't be able to have two connections with proton, once their tunnels always have the same address"
Just change the first "2" to a different number (e.g. 10.3.0.2).

Hi Bob,

I will try it here. I was afraid the tunnel IP was somehow tied to a NAT on Proton's side.
Doesn't hurt to try.

Many thanks


Quote from: Bob.Dig on Today at 12:58:04 PM
Quote from: rumshot on Today at 12:38:49 PM
"I won't be able to have two connections with proton, once their tunnels always have the same address"
Just change the first "2" to a different number (e.g. 10.3.0.2).