Gateway Groups - Load Balancing - Sticky connections per IP or sessions?

Started by cinntech, May 06, 2020, 05:52:43 PM

Just switched to OPNsense from Untangle. So far I prefer OPNsense, but I'm having issues with dual-WAN load balancing that I wasn't having with Untangle.

While load balancing is enabled, my clients (home users) get pauses after most things they do. For example, an Amazon Firestick will show 'no internet' while still connected to Wi-Fi after every stream or when going back to the main menu. After a few seconds it starts working again. It appears to be switching connections on every session. I see similar issues with Google Home, PCs browsing Facebook, etc.: no internet, then internet, then no internet.

When the gateway group is set up as Tier 1 / Tier 1, the issue above occurs.
When the gateway group is set up as Tier 1 / Tier 2, the issue goes away.
When the gateway group is set up as Tier 2 / Tier 1, the issue goes away.
If I force a gateway for an IP (not using gateway groups), I have no issues at all.
Gateway Group Trigger is [Member Down]
Gateway monitoring is enabled (no check in [disable gateway monitoring]) and all are showing online.

Firewall - Settings - Advanced - [Use sticky connections] is checked.
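For anyone following along, here is a rough Python sketch of what sticky connections are meant to do when both gateways sit in the same tier. It is only a toy model under stated assumptions, not how pf or OPNsense actually implement it: the class name `GatewayGroup`, the `pick` method, the timestamps, and the client addresses are all made up for illustration. Without stickiness each new connection from a client takes the next gateway in round-robin order; with stickiness the client keeps its gateway until its tracking entry expires.

```python
import itertools


class GatewayGroup:
    """Toy model: round-robin over same-tier gateways, optionally sticky per source IP."""

    def __init__(self, gateways, sticky=True, timeout=0):
        self.gateways = list(gateways)
        self._rr = itertools.cycle(self.gateways)  # round-robin iterator
        self.sticky = sticky
        self.timeout = timeout  # hypothetical stand-in for a source tracking timeout
        self._table = {}  # src_ip -> (gateway, expires_at)

    def pick(self, src_ip, now):
        """Choose a gateway for a new connection from src_ip at time `now` (seconds)."""
        if self.sticky:
            entry = self._table.get(src_ip)
            if entry is not None and entry[1] > now:
                gateway = entry[0]
                # Refresh the entry so an active client keeps its gateway.
                self._table[src_ip] = (gateway, now + self.timeout)
                return gateway
        gateway = next(self._rr)
        if self.sticky:
            self._table[src_ip] = (gateway, now + self.timeout)
        return gateway


if __name__ == "__main__":
    plain = GatewayGroup(["WAN1", "WAN2"], sticky=False)
    print([plain.pick("10.10.10.50", t) for t in range(4)])

    sticky = GatewayGroup(["WAN1", "WAN2"], sticky=True, timeout=30)
    print([sticky.pick("10.10.10.50", t) for t in range(4)])
```

In this model the non-sticky group alternates between the two WANs for the same client, which is the kind of per-session switching described above, while the sticky group keeps returning the first gateway it handed out.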

I'm not sure what I'm missing here...



It may be a DNS issue as well...

I have OPNsense as the DNS server using Unbound DNS; all clients point to OPNsense for DNS.

I have a rule to allow DNS as per (https://docs.opnsense.org/manual/how-tos/multiwan.html):
  IPv4 TCP/UDP   *   *   10.10.10.1   53 (DNS)   *   *   Local Route DNS

In [Services - Unbound DNS - General], Outgoing Network Interfaces is currently set to [All (recommended)].

Would it cause issues if the DNS lookups went out one WAN while the traffic for a session went out the other?


I tried raising the source tracking timeout under [Firewall - Settings - Advanced] to 3000, and the issue is the same. It's easy to test by browsing Facebook: videos start playing when you scroll, and if you watch for a second or two you get the loading screen. Does anyone else get this with load balancing?

I too am facing the exact problem you described. Did you determine the proper settings to solve this?

When I see the problem occurring, the firewall live view shows denied traffic incoming on one of the WAN interfaces, which I presume is asymmetrically routed traffic. Looking for a solution!

I redirect traffic out multiple VPNs in a similar setup, and I tracked it down to two things. If I have a firewall rule that changes the gateway to a gateway group and that rule matches anything more than a single IP, odd things happen or it doesn't work.

If I want to use firewall rules with /24 or /19 networks and gateway groups, I also need to disable sticky connections: under Firewall > Settings > Advanced > Multi-WAN, uncheck [Use sticky connections].

Are you using 20.7?

I have a different issue, but it sounds similar: mine involves captive portal and a multi-WAN gateway group.
It would seem that for a moment, when the OS switches between the two WANs, a routing issue occurs.


I observe a similar problem. I'm trying to load balance traffic between two VPN tunnels in a gateway group. As long as both tunnels have the same tier set, the Internet connection is unstable and hardly usable. The problem disappears as soon as different tiers are defined for each tunnel in the gateway group (failover without load balancing).
Sticky sessions are enabled.
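
To make the tier behaviour concrete, here is a minimal Python sketch of how tiers decide between load balancing and failover. It is a simplification under stated assumptions: the function name `active_gateways` and the gateway names are hypothetical, and it ignores weights, trigger levels, and everything else a real gateway group takes into account. The lowest-numbered tier that still has an online member is used; if that tier has more than one member, traffic is balanced across them.

```python
def active_gateways(tiers, online):
    """Return the online gateways of the lowest-numbered tier that has any.

    tiers:  dict mapping gateway name -> tier number (1 is the highest priority)
    online: set of gateway names currently considered up
    """
    for tier in sorted(set(tiers.values())):
        members = [gw for gw, t in tiers.items() if t == tier and gw in online]
        if members:
            return members
    return []  # every gateway in the group is down


if __name__ == "__main__":
    # Same tier: both tunnels are in use at once (load balancing).
    print(active_gateways({"VPN1": 1, "VPN2": 1}, {"VPN1", "VPN2"}))
    # Different tiers: only the Tier 1 tunnel is used while it is up (failover).
    print(active_gateways({"VPN1": 1, "VPN2": 2}, {"VPN1", "VPN2"}))
    # Tier 1 tunnel down: traffic fails over to Tier 2.
    print(active_gateways({"VPN1": 1, "VPN2": 2}, {"VPN2"}))
```

With different tiers only one gateway is ever returned, so there is nothing to balance and no chance of a client being moved between tunnels mid-session, which matches the observation that the instability only shows up when both members share a tier.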

https://forum.opnsense.org/index.php?topic=19977.msg93076#msg93076

Based on this thread, it would seem that "sticky connections" should be off for multi-WAN to work better.

Turning it off should cause problems with IP-sensitive sites, though; for now, you would have to set a policy for those connections individually.

As said there, nothing is promised, but it will be discussed with the developers.

Based on my own inconclusive tests, I don't encounter issues with IP-sensitive sites; maybe sites nowadays have better session control...
Inconclusive as yet, but disabling sticky connections helps.

I'm OK with just failover (when gateways have different tiers) for now. My ISP offers me only 300 Mb/s and one WireGuard instance fully covers that, so I can easily route all home traffic through the tunnel without needing load balancing.


You can also disable shared forwarding (no QoS or Captive Portal possible) and use sticky with it.


Disabling shared forwarding was the fix for me. I struggled with this for a long time, and with that disabled everything is working perfectly now with both WANs in the same tier. Thank you for the help.