Multi-WAN not failing gateways back after uplink returns

Started by pjw, May 23, 2024, 09:42:11 AM

I have a multi-WAN setup with two ISPs, which ran for about 2 years on 23.x.  I hadn't noticed this issue since upgrading to 24.x, because my Starlink link had been stable.  Recently, though, it has been dropping in and out for some cosmic reason.

My setup is two groups: work traffic goes out one uplink and the rest of the home traffic goes out the other, each with failover set to the opposite uplink.  Failover seems to work fine, more or less, but failback doesn't.  To recover, I either have to reboot the firewall, or make some change to the Firewall rules or Gateway config, return it to the original config, and hit Apply.  That seems to reapply the setup.  Something definitely changed in the automatic failover between 23.x and 24.x.
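To make the behaviour I'm expecting concrete, here is a minimal sketch of tier-based selection for the two groups described above.  It is only a model of what I expect to happen, not the firewall's actual implementation; the names WAN_A, WAN_B, work_group, home_group and pick_gateway are all made up for illustration.

```python
# Simplified model of tiered gateway groups (hypothetical names throughout).
# Each group maps a gateway name to its tier; a lower tier number is preferred.

def pick_gateway(group, status):
    """Return the online gateway with the lowest tier, or None if all are down."""
    online = [name for name in group if status.get(name, False)]
    if not online:
        return None
    return min(online, key=lambda name: group[name])

# Assumed layout: each uplink is primary in one group and backup in the other.
work_group = {"WAN_A": 1, "WAN_B": 2}
home_group = {"WAN_B": 1, "WAN_A": 2}

status = {"WAN_A": True, "WAN_B": True}
before = pick_gateway(work_group, status)       # primary is used

status["WAN_A"] = False                         # uplink drops
during = pick_gateway(work_group, status)       # failover to the backup

status["WAN_A"] = True                          # uplink returns
after = pick_gateway(work_group, status)        # expected failback to primary

print(before, during, after)
```

In this model the last selection goes back to the primary as soon as it is marked up again.  What I'm seeing instead is that the group stays on the backup until I reboot or re-apply the config.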

I'm hoping someone has insight into why failback seems to be broken.  If it's a known issue, I couldn't find anything about it in the forums.  If there's additional info I can provide, let me know.

I don't have a fix for you, but I'm having the exact same issue.  I've been using multi-WAN for years without too much pain, but now it simply doesn't fail back.  I'm not sure what changed, but it happened within the last few releases.  This is a fairly big issue for me because my backup LTE connection is metered.  Last week I didn't notice it hadn't failed back until I'd already used about 10 GB of data.  If you find a fix, please post it.  I might create a bug report shortly.

Firewall - Settings - Advanced - Skip rules when gateway is down

Have you enabled that? If I remember correctly, failback didn't work well before enabling it.

Quote from: schmuessla on May 30, 2024, 12:17:42 PM
Firewall - Settings - Advanced - Skip rules when gateway is down

Have you enabled that? If I remember correctly, failback didn't work well before enabling it.

I admit I don't fully understand what this option does.  But my ruleset specifies a gateway group, not a specific gateway.  And to be clear, failover has worked mostly well for years now; this only started happening within the last few updates.

There was an older, ongoing issue where some connections were overly sticky and existing connections wouldn't fail back.  This new issue is different: even new connections still choose the lower-valued gateway after the primary is back up.  The only fix is to reboot the secondary gateway or the firewall itself.
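To spell out the difference between the two issues, here is a rough sketch.  It assumes a simplified model where each established connection remembers the gateway it was given and only new connections go through tier selection; it is not how the firewall handles states internally, and every name in it (route, states, WAN_PRIMARY, WAN_LTE) is hypothetical.

```python
# Toy model: existing connections stay pinned, new ones are selected by tier.

def pick_gateway(tiers, status):
    """Return the online gateway with the lowest tier, or None if all are down."""
    online = [name for name in tiers if status.get(name, False)]
    return min(online, key=lambda name: tiers[name]) if online else None

def route(conn_id, states, tiers, status):
    """Reuse the gateway stored for a known connection; select one for a new connection."""
    if conn_id not in states:
        states[conn_id] = pick_gateway(tiers, status)
    return states[conn_id]

tiers = {"WAN_PRIMARY": 1, "WAN_LTE": 2}
status = {"WAN_PRIMARY": False, "WAN_LTE": True}    # primary is down
states = {}

old_during_outage = route("old", states, tiers, status)   # lands on LTE

status["WAN_PRIMARY"] = True                               # primary returns
old_after = route("old", states, tiers, status)            # sticky: stays on LTE
new_after = route("new", states, tiers, status)            # should use primary

print(old_during_outage, old_after, new_after)
```

The sticky behaviour of the "old" connection matches the earlier issue.  What I'm describing now is the "new" connection also ending up on the LTE gateway, which this model says should not happen.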

FWIW, with the last update, there was a blurb in there about gateway failover.  It didn't sound completely related, but potentially related.  Since that update, I haven't had an issue with failback from what I can tell.

Quote from: pjw on June 08, 2024, 09:46:27 PM
FWIW, with the last update, there was a blurb in there about gateway failover.  It didn't sound completely related, but potentially related.  Since that update, I haven't had an issue with failback from what I can tell.

Same here. Issue appears to be resolved after the last update.