Dual WAN - combining load balancing and failover

Started by Shoresy, December 21, 2022, 06:46:09 AM

New to OPNsense...I followed instructions for setting up Multi-WAN for two WAN connections connected to my residence. First connection is symmetrical 1Gb fiber, the second is cable, which has a much slower upload speed.

I created a Gateway group, per instructions, setting the fiber connection to Tier 1, Cable to Tier 2. I also went into the single WAN config and added weighting, 3:1 fiber:cable. Also created the firewall rule that handles DNS to both gateways, as well as configured monitoring on each WAN per instructions.

I'd like to load balance the two WANs as well as use them for failover, but can't quite figure out how, as the instructions leave some unanswered questions for that config. If I try to set both WANs to Tier 1 in the Gateway group, I can no longer connect to any websites after applying the config. I have to go back to Tier 1/Tier 2 to get things working. I am using "sticky connections" in the advanced firewall area, as well as in the Gateway group config, where I chose "round robin w/sticky address."

Perhaps what I'm trying to do is not possible, and I need 3 or more WANs to accomplish a combined failover + load balance config. Or am I missing something within the config to combine load balancing and failover for 2 WANs?

Any suggestions appreciated.



OPNsense 25.1.x-amd64
Intel(R) Celeron(R) N5105 CPU @ 2.00GHz
Intel I226-V 2.5GbE ports x6
16GB DDR4 RAM
256GB NVMe SSD
Dual WAN 1Gb symmetrical Fiber + 1Gb Cable

Maybe I haven't had enough coffee, but I'm not sure I'm following. In a load-balanced scenario, the connections intrinsically ALSO fail over by default. If a connection goes down, the other connection takes the full load. So you literally just need to follow the load balance guidance.

If you're looking for HA, that's something entirely different for hardware redundancy.
OPNsense 25.7.6 running on:
Dell Optiplex 3050
Intel I5-7600 @ 3.5Ghz (4 Cores)
Intel I350-T4 Nic
8G DDR4
256G SSD

How does load balancing work when the two WANs have different IPs? If the WAN getting the request is down, wouldn't the IP be dead to the requester? Especially when one WAN is dynamic and the other static. I would think load balancing would only apply to outbound traffic.


December 22, 2022, 04:41:59 PM #4 Last Edit: December 22, 2022, 04:45:32 PM by Shoresy
Quote from: axsdenied on December 22, 2022, 04:18:27 PM
Maybe I haven't had enough coffee, but I'm not sure I'm following. In a load-balanced scenario, the connections intrinsically ALSO fail over by default. If a connection goes down, the other connection takes the full load. So you literally just need to follow the load balance guidance.

If you're looking for HA, that's something entirely different for hardware redundancy.

I was referring to the paragraph at the bottom of this page, "combining load balancing AND failover."

https://docs.opnsense.org/manual/how-tos/multiwan.html#:~:text=To%20combine%20Load%20Balancing%20with,hold%20multiple%20ISPs%2FWAN%20gateways.

Quote: To combine Load Balancing with Failover you will have 2 or more WAN connections for Balancing purposes and 1 or more for Failover. OPNsense offers 5 tiers (Failover groups) each tier can hold multiple ISPs/WAN gateways.

I may be misinterpreting, but it appears as though it's possible to use a DUAL WAN for failover and load balancing simultaneously? In other words, failover still happens if a link goes down, but while you have TWO links up, might as well load balance across them. Does this make sense?

A $60 router (the TP-Link ER605) can handle a combo load balance + failover configuration. I used that device prior to "upgrading" to an x86 mini-PC, Intel V225 2.5Gbe J4125 CPU w/16GB of RAM w/OPNsense. The ER605 wasn't great because it couldn't handle the throughput of my 1Gbps symmetrical fiber connection...it struggled to achieve upload beyond 300Mbps on a 1Gbps upstream connection...that's expected with a $60 underpowered router.

December 22, 2022, 04:48:02 PM #5 Last Edit: December 22, 2022, 04:51:15 PM by axsdenied
Right, it's referring to the tiers.  2 links set to the same tier are load balanced, but also if one goes down, the other can take the full load as well. (not technically a "failover" in the terms this doc is referring to).

If you have Tier 1 links, you can also fail over to Tier 2 links, and so on. (failover)
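
The tier behavior described above can be sketched in a few lines of Python. This is only a simplified model of what the thread describes, not OPNsense code: the `Gateway` class, the `active_gateways` function, and the names FIBER and CABLE are all hypothetical and exist just for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    name: str
    tier: int     # 1 is the highest priority
    online: bool  # what gateway monitoring would report


def active_gateways(group):
    """Return the names of the online gateways in the best tier that has any."""
    online = [gw for gw in group if gw.online]
    if not online:
        return []
    best_tier = min(gw.tier for gw in online)
    return [gw.name for gw in online if gw.tier == best_tier]


# Both WANs in Tier 1: both carry traffic while both are up.
same_tier = [Gateway("FIBER", 1, True), Gateway("CABLE", 1, True)]
print(active_gateways(same_tier))

# Tier 1 / Tier 2: CABLE stays idle until FIBER goes down.
tiered = [Gateway("FIBER", 1, True), Gateway("CABLE", 2, True)]
print(active_gateways(tiered))

# FIBER offline: CABLE takes everything, whichever tier it is in.
fiber_down = [Gateway("FIBER", 1, False), Gateway("CABLE", 2, True)]
print(active_gateways(fiber_down))
```

In this model, putting both links in the same tier gives balancing plus the "other link takes the full load" behavior, while a lower tier only comes into play once every link in the tier above it is down.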

I have 2 links, but I use it in ONLY a failover scenario.  One link is set to Tier 1 and the other is Tier 2.

Quote from: axsdenied on December 22, 2022, 04:48:02 PM
...

I have 2 links, but I use it in ONLY a failover scenario.  One link is set to Tier 1 and the other is Tier 2.

Have you tried setting both of your WANs to Tier 1? How did it work out?

December 23, 2022, 04:13:18 PM #7 Last Edit: December 23, 2022, 04:14:57 PM by cesarvog
Quote from: Shoresy on December 21, 2022, 06:46:09 AM
...

I also have two WANs and can confirm that setting both to Tier 1 results in being unable to access anything on the Internet. I've followed the same instructions mentioned in the OP. Setting one WAN connection to Tier 1 and the other to Tier 2 works as expected.

December 24, 2022, 12:29:22 AM #8 Last Edit: December 24, 2022, 12:38:14 AM by Shoresy
Thanks for confirming - I've noticed the same exact issue when both Gateways are set to Tier 1. Basically no access. I don't think that's the expected behavior.

The behavior I would expect is that when both Gateways are Tier 1, they load balance, and if one WAN is weighted heavier than the other, the WAN with the most weight gets the most traffic routed inbound/outbound. If either WAN goes down, OPNsense should direct all traffic through the WAN that is still online and functional, following a typical failover scenario. Unless I'm missing something, such as an extra firewall rule, I can't for the life of me figure out why internet access doesn't work when both Gateways are Tier 1, yet works fine when they're set to Tier 1 and Tier 2. The problem with the Tier 1/Tier 2 config is that Tier 2 gets no traffic when both Gateways are up.
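
As a rough illustration of the expected weighting, here is a minimal Python sketch of weighted round robin with sticky source addresses, using the 3:1 fiber:cable weights from the first post. It is an assumption-laden model, not how OPNsense or pf implements it: `StickyBalancer`, the gateway names, and the client addresses are hypothetical.

```python
from collections import Counter
from itertools import cycle


class StickyBalancer:
    """Weighted round robin that remembers which gateway each source got."""

    def __init__(self, weights):
        # {"FIBER": 3, "CABLE": 1} becomes FIBER, FIBER, FIBER, CABLE, repeated.
        slots = [name for name, weight in weights.items() for _ in range(weight)]
        self._next_slot = cycle(slots)
        self._sticky = {}  # source address -> gateway name

    def gateway_for(self, source_ip):
        # A new source takes the next slot; a known source keeps its gateway.
        if source_ip not in self._sticky:
            self._sticky[source_ip] = next(self._next_slot)
        return self._sticky[source_ip]


balancer = StickyBalancer({"FIBER": 3, "CABLE": 1})

# Eight hypothetical LAN clients, each opening its first connection in turn.
clients = [f"192.168.1.{host}" for host in range(10, 18)]
assignment = {ip: balancer.gateway_for(ip) for ip in clients}

share = Counter(assignment.values())
print(share)  # 6 clients on FIBER, 2 on CABLE
```

One consequence of stickiness in this model is that the split is per client rather than per connection, so a single busy client never spreads its traffic across both WANs.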

I have tested failover, which seems to work fine...I unplugged my primary WAN link (Tier 1), and within a few seconds everything went to WAN 2 (Tier 2). When I connected WAN 1, traffic routed back over to the Tier 1 WAN as it should. I just read a new post however that failover might not be working properly in v 22.7.10_2, but have not tested it recently to confirm.

To get any further, I think we'd need to dig deeper into the "loss" of internet.  What steps are you using to verify this?

December 24, 2022, 09:22:51 PM #10 Last Edit: December 24, 2022, 10:25:48 PM by Shoresy
Quote from: axsdenied on December 24, 2022, 03:51:06 PM
To get any further, I think we'd need to dig deeper into the "loss" of internet.  What steps are you using to verify this?

As soon as I flip both WANs to Tier 1, clients can't get out to the Internet...pings fail, packet loss, complete failure to connect outside the LAN.

DNS fails to resolve
pings fail
substantial packet loss
traceroutes fail

Load balancing with both WANs set to Tier 1 is a complete disaster. I have sticky addresses enabled, etc. No errors in any of the OPNsense logs that point out what might be happening here. This is a function that works perfectly in a $60 TP-Link router.
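
For anyone trying to narrow down this kind of failure from a LAN client, a small Python sketch like the one below can separate "DNS is broken" from "nothing routes at all". It is not an OPNsense tool; the function names and the `diagnose` labels are made up for illustration, and the test targets are arbitrary assumptions.

```python
import socket


def can_resolve(hostname):
    """True if the client's configured resolver returns any address."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def can_connect(ip, port=443, timeout=3.0):
    """True if a TCP connection to a literal IP succeeds (no DNS involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused and unreachable
        return False


def diagnose(dns_ok, ip_ok):
    """Turn the two check results into a rough label."""
    if dns_ok and ip_ok:
        return "ok"
    if ip_ok:
        return "DNS failure only"
    if dns_ok:
        return "routing failure (DNS answer may have come from a local cache)"
    return "no connectivity"
```

Running `diagnose(can_resolve("opnsense.org"), can_connect("1.1.1.1"))` once with Tier 1/Tier 2 and again with both gateways in Tier 1 would show whether the breakage is limited to DNS or affects all outbound traffic; both the hostname and the IP are just example targets.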

Same symptoms here whenever I try to set both WAN connections to the same tier.

Question: Do you have the AdGuard Home plugin set up on your OPNsense router? (I do.) If so, I wonder if this could be causing the issue...

December 24, 2022, 11:50:56 PM #12 Last Edit: December 25, 2022, 12:34:39 AM by Shoresy
Quote from: cesarvog on December 24, 2022, 11:39:15 PM
Same symptoms here whenever I try to set both WAN connections to the same tier.

Question: Do you have the AdGuard Home plugin set up on your OPNsense router? (I do.) If so, I wonder if this could be causing the issue...

I don't use AdGuard as a plugin on OPNsense, but I am using Pi-hole to block ads, and Pi-hole is set as my internal DNS server.

I found some older threads regarding this same load balancing issue. Seems that several others have encountered similar problems, and the answer seems to be to disable shared forwarding when using Sticky Connections. Per OPNsense documentation, Sticky Connections prevents the issues that can crop up when balancing between multiple WANs, so shared forwarding has to go.

Firewall > Settings > Advanced

** Uncheck shared forwarding

Save then apply.

https://forum.opnsense.org/index.php?topic=17449.0

https://forum.opnsense.org/index.php?topic=17116.msg93965#msg93965

I have done this with my OPNsense config and balancing now appears to work with both WAN gateways set to Tier 1. I admittedly don't fully understand what shared forwarding actually does, but with it disabled/unchecked, clients can access the "Googles" and the "Interwebs" with the Gateways set to the SAME TIER. I hope this helps others...it's a struggle to get load balancing working properly otherwise.

December 25, 2022, 12:00:55 AM #13 Last Edit: December 25, 2022, 02:15:20 PM by cesarvog
Thanks, will try the setting as you suggest.
Merry Christmas!

EDIT: Yes, it works. Thanks once again.

This one is resolved...after a few days of running OPNsense with Shared Forwarding disabled in Firewall > Settings > Advanced, load balancing with both WANs set to Tier 1 is working as expected.