Dual gateway | Reply-to?

Started by CMogen, March 06, 2025, 11:58:33 PM

Hi, I am in the process of migrating things from a Firebox to OPN. Both firewalls host SSLVPN, and share a subnet with the LAN.

      OLD                         New
    ┌────────────────────┐     ┌────────────────────┐
    │ext   7.7.7.25 /26  │     │     7.7.7.33 /26   │
    │                    │     │                    │
    │tun  10.9.9.1 /24   │     │    10.8.8.1 /24    │
    │                    │     │                    │
    │int  10.10.10.1 /24 │     │    10.10.10.2 /24  │
    └────────────────────┘     └────────────────────┘ 

Things work fine if the target LAN machine's gateway matches the VPN entry point, otherwise the return traffic gets sent to the other firewall. The eventual plan is to change all the LAN to the new gateway, but I was hoping to figure out how to get this to work in parallel, if only to better understand the *sense inner workings and terminology.

I have tried disabling the reply-to WAN option, created an interface for OpenVPN, tried some different NAT options there, even created a static route on a LAN machine for the New tunnel network > New gateway, but I don't see that traffic coming back, so now I'm lost and not entirely sure what OPNsense is doing behind the scenes.

Is reply-to the right setting for this, or one of the NAT reflection WAN general rules?
What IP should the LAN nodes see coming from OpenVPN- the tunnel network IP, default WAN gateway, some other NAT? Does this change with NAT option x?
Do I need an outbound NAT rewrite to make the VPN traffic source look like New gateway perhaps?

If someone could break down the NAT automatic outbound/1:1/reflection/reply-to terminology and use cases I would really appreciate it. I've re-read the documentation on each several times and still can't get my head around what they actually do or when/where that would be applied, especially with ovpn in the mix. TIA


Your description is a bit vague. You say "things work fine", with no description of "things". I'm going to assume you have VPN clients connecting to these firewalls and they're trying to access some services running on LAN hosts?

So a client on "New" would get an IP address like 10.8.8.100, and might be trying to access some service on, say, 10.10.10.100. Because the default gateway on that LAN host is 10.10.10.1, the return traffic (destined for IP address 10.8.8.100) gets sent to "OLD" instead of "New". A static route for 10.8.8.0/24 pointing to 10.10.10.2 should solve that. Maybe look at that again. If you're sure you've configured it correctly, try a packet capture on the LAN interface of OPNsense to see what's happening (or not happening).
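To illustrate why the static route should help, here is a minimal Python sketch of how a LAN host picks a next hop by longest-prefix match. The addresses come from this thread; the table layout and the `next_hop` helper are assumptions made up for the example, not how any particular OS stores its routes.

```python
import ipaddress

# Hypothetical routing table for a LAN host such as 10.10.10.100.
# Each entry is (destination network, next hop); None means on-link delivery.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.10.10.1"),   # default gateway = OLD
    (ipaddress.ip_network("10.10.10.0/24"), None),        # directly connected LAN
]


def next_hop(routes, destination):
    """Return the next hop chosen by longest-prefix match (None = on-link)."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The most specific (longest prefix) matching route wins.
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


# Without the static route, the reply to a VPN client is handed to OLD.
before = next_hop(ROUTES, "10.8.8.100")

# With a static route for the New tunnel network via New's LAN address.
with_static = ROUTES + [(ipaddress.ip_network("10.8.8.0/24"), "10.10.10.2")]
after = next_hop(with_static, "10.8.8.100")

print(before)  # 10.10.10.1
print(after)   # 10.10.10.2
```

If a real host behaves differently from this model, that is exactly what the packet capture should reveal.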

The options you're wondering about pertain primarily to the WAN side of things, but (if my assumptions/interpretation are correct), your problem is on the LAN side.

Quote from: dseven on March 07, 2025, 09:49:34 AM
I'm going to assume you have VPN clients connecting to these firewalls and they're trying to access some services running on LAN hosts?
Correct.

I believe most of the problem is coming from UDP used for VPN encapsulation. No matter what options I set on the New/OPN firewall, the return traffic always seems to follow the OS's default gateway, even skipping the OS route entry suggested here:
Quote from: dseven on March 07, 2025, 09:49:34 AM
static route for 10.8.8.0/24 pointing to 10.10.10.2 should solve that

I ended up restoring a backup from a couple days ago because I was so deep in the weeds with adding gateways, reply-tos, NAT, I wasn't sure what all was changed. After restoring the backup and creating a new TCP-based tunnel, it's still not working out of the box, but I assume I need to (re)set a gateway and reply-to for the VPN traffic?

Would this be a situation where I should have a LAN gateway (marked upstream)? Most of the docs advise against that unless absolutely necessary, but I'm not sure how else to bind the VPN to an interface and/or specify the reply-to as the LAN interface IP.

What do you mean by "OS route entry"? The static route needs to be added *on the LAN host*.

What makes you think that the problem has anything to do with VPN encapsulation? If that was the case, why would it work when the default gateway of the LAN host points to the firewall handling the VPN?

Quote from: dseven on March 08, 2025, 10:16:01 AM
What do you mean by "OS route entry"? The static route needs to be added *on the LAN host*.
It was, sorry if it wasn't clear. I said OS meaning Win/Linux route table vs something set in OPNsense.

Quote from: dseven on March 08, 2025, 10:16:01 AM
What makes you think that the problem has anything to do with VPN encapsulation? If that was the case, why would it work when the default gateway of the LAN host points to the firewall handling the VPN?

Documentation I guess. I assume the same would apply in OPNsense. From https://docs.netgate.com/pfsense/en/latest/vpn/openvpn/multi-wan.html

Quote
The protocol choice for UDP on IPv4 and IPv6 on all interfaces (multihome) will work properly on all WANs and respond back using the address clients expect.

These other UDP modes in OpenVPN are limited by the connectionless nature of UDP. In these cases, the OpenVPN instance replies back to the client, but the Operating System selects the route and source address based on what the routing table believes is the best path to reach the peer. For non-default WANs, that will not be the correct path or the address the peer used when contacting this VPN.

You seem to be fixated on the idea that this is an OPNsense multi-WAN issue, but you don't have multiple WAN interfaces on your OPNsense firewall (or if you do, you have not described that). The documentation you quote pertains to the VPN protocol traffic routing back to the VPN client through the appropriate WAN interface (on OPNsense). That's not your issue (or not what you've described).

The static route on the LAN host should work. If it doesn't, I'd be doing packet captures to see what's going on.

Another possibility might be to add a static route for the VPN client subnet on the old firewall, pointing to the new firewall's LAN address. The old firewall might send an ICMP redirect, and the LAN host might follow it. This would have the advantage of only one place to add the route, rather than every LAN host, but there are a few "mights" that have to happen for this to work.

Well, it is a multi-WAN setup (from the perspective of a LAN host), they're just not on the same device, and I think that's most of the problem. If both LAN gateways were on the OPNsense, I'm sure conntrack (or whatever BSD/OPN uses) would auto-magically fix it. The UDP scenario could apply to my situation, and seems like that's exactly what's happening. It's either getting the external IP as a reply to, or it's a weird stateless multi-home UDP thing like this

You're right though, I guess I need to break out Wireshark and try to figure out what's going on. I was hoping someone might point me to some additional docs/workshop/utils, to better explain the flow and when/how those various NAT/reflection/reply options are set. I was trying to avoid manually figuring that out via trial and error and a packet sniffer.

Especially when I've had configs get "stuck" on a few occasions. Like deleting a gateway or interface, and seeing that interface's rules still being applied to traffic somehow when I'm testing option F 20 minutes later. Without rebooting between every change, I'm now questioning if what I see in the GUI matches the running config. It's tedious already, and I haven't alt-tabbed to a packet sniffer between every change yet.


Quote from: CMogen on March 09, 2025, 08:06:48 PM
Well, it is a multi-WAN setup (from the perspective of a LAN host), they're just not on the same device, and I think that's most of the problem.
You have an issue with VPN traffic. This has nothing to do with the WANs.

I don't expect that a simple static route on the old router would work. I assume it would end up in asymmetric routing issues, and OPNsense might block the forwarded packets.
It would be necessary to separate the non-default gateway from the LAN completely and configure a transit network between the firewalls.

If you only need to allow access from the VPN client to your LAN and don't care about the client's source IP, you can simply masquerade the traffic with the LAN IP of OPNsense.
This can be done with an outbound NAT rule like this:
interface: LAN
source: VPN tunnel subnet
destination: LAN network
translation: interface IP (LAN)

The outbound NAT must be set to hybrid mode (or manual) to enable this rule.
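As a rough illustration of what that rule does to a packet, here is a small Python model. It is only a sketch: the `Packet` class and both helper functions are made up for the example, the addresses are the ones from this thread, and the real translation is of course done by the firewall's packet filter, not by code like this.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Values from the thread; the rule layout is a simplification.
TUNNEL_NET = ipaddress.ip_network("10.8.8.0/24")   # source: VPN tunnel subnet
LAN_NET = ipaddress.ip_network("10.10.10.0/24")    # destination: LAN network
OPNSENSE_LAN_IP = "10.10.10.2"                     # translation: interface IP (LAN)


def outbound_nat(packet):
    """Rewrite the source of tunnel-to-LAN packets leaving the LAN interface."""
    src_in_tunnel = ipaddress.ip_address(packet.src) in TUNNEL_NET
    dst_in_lan = ipaddress.ip_address(packet.dst) in LAN_NET
    if src_in_tunnel and dst_in_lan:
        return replace(packet, src=OPNSENSE_LAN_IP)
    return packet  # anything else is left untouched by this rule


def reply_is_on_link(request):
    """True if the LAN host can answer without using its default gateway."""
    return ipaddress.ip_address(request.src) in LAN_NET


original = Packet(src="10.8.8.100", dst="10.10.10.100")
translated = outbound_nat(original)

print(translated.src)                # 10.10.10.2
print(reply_is_on_link(original))    # False: reply would go to the default gateway
print(reply_is_on_link(translated))  # True: reply goes straight back to OPNsense
```

The LAN host only ever sees 10.10.10.2 as the peer, which is why its default gateway setting stops mattering for this traffic.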

Quote from: viragomann on March 09, 2025, 09:17:32 PM
If you only need to allow access from the VPN client to your LAN and don't care about the client's source IP, you can simply masquerade the traffic with the LAN IP of OPNsense.
Thanks for that, I'm pretty sure I tried it at some point, but I'm not sure if maybe another (floating?) rule was getting to the traffic first. I plan on deleting all the auto-generated rules and working up and I'll try that again.

Just for the sake of beating horses, and understanding why I'm fundamentally wrong on this, but I still see it as a multi-WAN scenario. Let's say the 7.7.7s are actually two different ISPs, and take the old firewall out, it would look like this all on a single OPNsense:

    ┌────────────────────┐    ┌────────────────────┐
ext │ if1 7.7.7.25 /26   │    │if2  3.3.3.33 /26   │
    │                    │    │                    │
tun │ vpn1 10.9.9.1 /24  │    │vpn2 10.8.8.1 /24   │ two different ovpn instances
    │                    │    │                    │
int │ if3 10.10.10.1 /24 │    │if4  10.10.10.2 /24 │
    └────────────────────┘    └────────────────────┘ 
LANserverA = defgw 10.10.10.1
LANserverB = defgw 10.10.10.2

Without some magical intervention (conntrack, auto proxyarp, noclue) by being on the same device, would I not have the same problem connecting to serverA via vpn2/3.3.3.33 or serverB via vpn1/7.7.7.25?

Quote from: CMogen on March 12, 2025, 12:22:56 AM
Just for the sake of beating horses, and understanding why I'm fundamentally wrong on this, but I still see it as a multi-WAN scenario.

It is from the perspective of the LAN server, but it is not from OPNsense's perspective. The "reply-to" mitigation in OPNsense only comes into effect when **OPNsense itself** has multiple WAN interfaces. You would need some sort of equivalent on the LAN server - it would need to track sessions, and send return packets for sessions from your VPN back to the MAC address of OPNsense's LAN interface instead of that of its default gateway.
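To make the "track sessions" idea concrete, here is a purely conceptual Python sketch of what such an equivalent on the LAN server would have to do. Everything in it is hypothetical: the MAC addresses are made up, and `ReplyToTable` is an illustration of the concept, not something Windows, Linux, or OPNsense provides under that name.

```python
# Hypothetical MAC addresses, invented for this illustration.
OLD_FW_MAC = "02:00:00:00:00:01"   # LAN interface of OLD (the default gateway)
NEW_FW_MAC = "02:00:00:00:00:02"   # LAN interface of OPNsense


class ReplyToTable:
    """Remember which neighbour delivered each session's first packet."""

    def __init__(self, default_gateway_mac):
        self.default_gateway_mac = default_gateway_mac
        self.sessions = {}

    def record_inbound(self, peer_ip, peer_port, local_port, arrived_from_mac):
        # Key the session on the remote peer and the local service port.
        self.sessions[(peer_ip, peer_port, local_port)] = arrived_from_mac

    def reply_mac(self, peer_ip, peer_port, local_port):
        # Known session: answer via the device it arrived from.
        # Unknown session: fall back to the default gateway as usual.
        key = (peer_ip, peer_port, local_port)
        return self.sessions.get(key, self.default_gateway_mac)


table = ReplyToTable(default_gateway_mac=OLD_FW_MAC)

# A VPN client reaches the server through OPNsense.
table.record_inbound("10.8.8.100", 50000, 443, NEW_FW_MAC)

print(table.reply_mac("10.8.8.100", 50000, 443))  # 02:00:00:00:00:02
print(table.reply_mac("10.8.8.101", 50001, 443))  # 02:00:00:00:00:01
```

Without something like this on the host, the static route or the NAT rule is what keeps the replies away from the default gateway.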

I still say that a static route on the LAN servers should work. The NAT hack should work too, but only in the inbound direction (LAN server would not be able to connect out to VPN addresses).

Quote from: CMogen on March 12, 2025, 12:22:56 AM
Let's say the 7.7.7s are actually two different ISPs, and take the old firewall out, it would look like this all on a single OPNsense:

If you took the old router out and only OPNsense remained, it would have **only one** LAN ("int") address, and that would be the default gateway for (all of) your LAN servers. Presumably you would use 10.10.10.1 for that and forget 10.10.10.2.

Quote from: dseven on March 12, 2025, 09:44:17 AM
If you took the old router out and only OPNsense remained, it would have **only one** LAN ("int") address, and that would be the default gateway for (all of) your LAN servers. Presumably you would use 10.10.10.1 for that and forget 10.10.10.2.
Fair enough.

For the NAT hack, if you could clarify a few things I think I can run with it.
(I do think the static route on LAN servers would work too, but maybe only with TCP, so I'll start with NAT)

1. Should I manually bind an interface to the vpn device? Docs say this may be required for reply-to and other NAT functions to work properly.
a. Set IP on if, or use whatever passthrough settings from ovpn?
b. Dynamic gateway checked?

2. Outgoing NAT masq- To confirm, this should be the LAN IP, not VPN tunnel IP? (I realize tunnel would require a static route in endpoint OS to work, but LAN I question if ovpn will automatically see/pick that up as return traffic)
a. If LAN: Outbound NAT policy on (1) the default-created OpenVPN interface, or on (2) the manually created interface (if created in step 1)?
b. ^ policy: source= Tunnel network > dest= LAN net > translation target= LAN address? [or manually specify 10.10.10.1]

3. Troubleshooting- if the NAT masq works and packets come back to the LAN IP, but ovpn doesn't seem to be picking them up, where/how could I see the logs for discarded packets, no route, etc. to find the breakdown?

4. Any general options that should/shouldn't be enabled. I am removing all auto-added pass/deny policies and switching to manual outbound NAT only, but may still have some non-default setting leftover from testing.

I truly appreciate any help, if I can get this working I think I'll be sold and join team OPNsense. Then I'll be happy to support the cause as well; I'm actually not another random FOSS user that expects on-site support and a guided tour, but it is all (frustratingly) new to me and the documentation can be vague at times.



There's nothing TCP-specific about a static route - it would apply for all IP protocols.

The outbound NAT rule would be created on the LAN interface, and the Translation/target would be "Interface address" (should be 10.10.10.2 in your case??). The reason this should work is that the LAN server will deliver to any address in 10.10.10.0/24 locally, whereas anything else would be sent to its default gateway (in the absence of a more specific route, such as that static one that I still say should work!)

For troubleshooting, I'd usually start with packet captures, and only start digging in logs if it appears that something is getting blocked.

BTW, if you switch to Manual outbound NAT, you may break internet access from your LAN hosts. You may want to use Hybrid mode to add your custom hack rule.