Can't get NAT before IPsec to work

Started by Colani1200, April 12, 2021, 03:00:15 PM

April 12, 2021, 03:00:15 PM Last Edit: April 15, 2021, 11:17:14 AM by Colani1200
Hi all,

I have to migrate a VPN tunnel that relies on NAT before IPsec from another gateway to OPNsense. The tunnel looks like this:


     LAN                                                      Customer site
-------------      NAT      ---------------------    IPsec    -------------
| Network A | ------------> | IP from Network B | ----------> | Network C |
-------------               ---------------------             -------------

This is a pretty common scenario in corporate environments. From what I've read, there were problems doing this with OPNsense in the past, but it should be possible with the current version 21.1.4 which I'm running.

In phase2, I have defined B as local network and C as remote network and I added network A as a manual SPD entry. The tunnel comes up fine, but my outbound NAT rule refuses to work when I bind it to the IPsec interface. When I bind the very same NAT rule to the WAN interface, traffic gets NATed as expected, but apparently it does not enter the tunnel.

Any idea what I am missing?

I have quite a lot of similar tunnels in production use and they work fine with OPNsense, and have done so since well before 21.1.4 :)

Your outbound NAT rule should look like this:


IPsec any * NETWORK_C  * VIRTUAL_IP_IN_NETWORK_B  * NO


In the phase 1 definition, "Install policy" must be ticked, otherwise the manual SPD entry won't work.
Make sure you have a firewall rule on the IPsec interface allowing the desired traffic to NETWORK_C.
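
To illustrate why the NAT rule is bound to the IPsec interface, here is a minimal Python sketch of the order of operations as it is described in this thread: the un-NATed source first has to match a manual SPD entry, and only then is it rewritten to the virtual IP so that it fits the phase 2 selector B -> C. All addresses and the names `MANUAL_SPD` and `nat_before_ipsec` are hypothetical, and this only models the logic, it is not how OPNsense implements it.

```python
from ipaddress import ip_address, ip_network

# Hypothetical addressing, chosen only for illustration.
NETWORK_A = ip_network("192.168.10.0/24")            # LAN
NETWORK_B = ip_network("10.20.0.0/24")               # phase 2 local network
NETWORK_C = ip_network("172.16.30.0/24")             # phase 2 remote network
VIRTUAL_IP_IN_NETWORK_B = ip_address("10.20.0.5")    # NAT address from the rule

# Manual SPD entries configured in phase 2 (assumption: only network A).
MANUAL_SPD = [NETWORK_A]


def nat_before_ipsec(src, dst, manual_spd=MANUAL_SPD):
    """Return (translated_src, dst) if the packet would enter the tunnel, else None."""
    src, dst = ip_address(src), ip_address(dst)

    # Step 1: the original (un-NATed) source must match a manual SPD entry,
    # otherwise the packet is never handed to IPsec.
    if dst not in NETWORK_C or not any(src in net for net in manual_spd):
        return None

    # Step 2: the outbound NAT rule on the IPsec interface rewrites the source.
    translated = VIRTUAL_IP_IN_NETWORK_B

    # Step 3: the translated packet must fit the phase 2 selector B -> C.
    if translated not in NETWORK_B:
        return None
    return translated, dst


if __name__ == "__main__":
    print(nat_before_ipsec("192.168.10.7", "172.16.30.9"))
    print(nat_before_ipsec("192.168.10.7", "172.16.30.9", manual_spd=[]))
```

In this model, removing the manual SPD entry makes the function return `None`, which corresponds to traffic that is allowed on LAN but never enters the tunnel.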

Thanks for taking the time to look at it, goodomens42. My outbound NAT rule looked like this:

IPsec Network_A * NETWORK_C  * VIRTUAL_IP_IN_NETWORK_B  * NO


I replaced "Network_A" with "any" as suggested, but it didn't help.

"Install policy" in phase1 is checked, I verified that.

I think a firewall rule on the IPsec interface should not be necessary because that is covered by an autogenerated rule (screenshot attached). Plus I don't see any relevant traffic being blocked in the log. Nevertheless I created the rule as you suggested, but no success.

Any other ideas?

Quote
I replaced "Network_A" with "any" as suggested, but it didn't help.

Shouldn't really matter, "NETWORK_A" is just fine and maybe more logical than "any".

Quote
"Install policy" in phase1 is checked, I verified that.

Did you also check whether there are any other routes pointing to NETWORK_C?
They might take precedence over policy-based routing, though I don't know this for sure.
It might also be an idea to turn off the automatic addition of routes under "VPN -> IPSEC -> Advanced Settings"; this will enforce policy-based routing. (I always do this, because it does not work with mixed IPv4/IPv6 tunnels anyhow.)

Quote
I think a firewall rule on the IPsec interface should not be necessary because that is covered by an autogenerated rule (screenshot attached). Plus I don't see any relevant traffic being blocked in the log. Nevertheless I created the rule as you suggested, but no success.

In my setups I usually turn off the automatic IPsec rules, hence the hint.
I think you are right, though: the automatic rule is outbound only, but it should cover return packets because the firewall is stateful.

April 14, 2021, 09:26:03 AM #4 Last Edit: April 14, 2021, 09:33:30 AM by Colani1200
Quote from: goodomens42 on April 13, 2021, 09:07:08 PM

Did you also check if there are any other routes pointing to NETWORK_C ?

I did, there aren't any.

Quote
It might also be an idea, to turn off automatic addition of routes under "VPN -> IPSEC -> Advanced Settings", this will enforce policy based routing.

Tried that right now, it didn't help. Still the only entry I have in the firewall log is an incoming allow on the LAN interface from (un-NATed) Network_A to Network_C.

Like I said in my first post, NAT does work when I just change the interface from IPsec to WAN in my NAT rule. To me this looks like the manual SPD entry doesn't get evaluated and traffic is not entering the tunnel.

You could check for the manual SPD entry under "VPN -> IPSEC -> Security Policy Database".
There should be an entry like this:


NETWORK_A NETWORK_C <- ESP WANIP_OF_YOUR_OPNSENSE -> WANIP_OF_REMOTE_ENDPOINT


If there isn't, something went wrong.
The SPD entry must also have the correct network mask. Did you check this?
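
For the network mask check, a small Python sketch using the standard `ipaddress` module can confirm whether a given client address is actually covered by the network entered as the manual SPD entry. The function names `spd_covers` and `normalized` and all addresses are hypothetical and only serve as an illustration.

```python
from ipaddress import ip_address, ip_network


def spd_covers(spd_entry, client_ip):
    """Return True if client_ip lies inside the network given as SPD entry."""
    # strict=True raises ValueError when host bits are set,
    # e.g. "192.168.10.1/24" instead of "192.168.10.0/24".
    network = ip_network(spd_entry, strict=True)
    return ip_address(client_ip) in network


def normalized(spd_entry):
    """Return the network address that an entry with host bits set belongs to."""
    return str(ip_network(spd_entry, strict=False))


if __name__ == "__main__":
    # Hypothetical client in NETWORK_A.
    print(spd_covers("192.168.10.0/24", "192.168.10.200"))
    # A mask that is too narrow silently excludes the same client.
    print(spd_covers("192.168.10.0/25", "192.168.10.200"))
    print(normalized("192.168.10.1/24"))
```

The second call shows the kind of mistake to look for: with a /25 instead of a /24, a client in the upper half of the network no longer matches the entry.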

April 14, 2021, 12:58:04 PM #6 Last Edit: April 14, 2021, 01:00:39 PM by Colani1200
Now this is interesting. The SPD entry is there, but the tunnel endpoint IP is totally wrong. Really no idea where that IP is coming from, this is the only tunnel currently configured on the OPNsense.

The peer is behind NAT-T, could that cause confusion somewhere?

April 14, 2021, 02:27:47 PM #7 Last Edit: April 14, 2021, 02:31:50 PM by goodomens42
Quote
The peer is behind NAT-T, could that cause confusion somewhere?

Maybe.

1. Have you tested whether the tunnel actually works?
You can do so by temporarily assigning VIRTUAL_IP_IN_NETWORK_B as a virtual IP to the OPNsense LAN interface and then running


ping -S VIRTUAL_IP_IN_NETWORK_B SOME_REACHABLE_IP_IN_NETWORK_C


on the OPNsense.

2. Have you tested reaching NETWORK_C from a client in NETWORK_A, or only from the OPNsense itself?
Traffic originating from the OPNsense itself is usually not covered by the SPD entry, as it will use WAN as the outgoing interface. That might explain why you see masqueraded packets once you move the outbound NAT rule to the WAN interface.

That ping works. I did some more tests in that direction and I need to clarify things. To be honest, my setup is a bit more complex than I first described. I tried to simplify a bit because I thought it was not relevant but it seemingly is. In fact, my setup looks like this:


 Client LAN                 OPNsense                                                    Customer site
-------------  Firewall   -------------      NAT      ---------------------    IPsec    -------------
| Network A | ----------> | Network D | ------------> | IP from Network B | ----------> | Network C |
-------------             -------------               ---------------------             -------------


As you can see, the OPNsense does not directly reside in the client LAN (Network A), but in the DMZ of another firewall (network D).

What I tried now: I added network D as a manual SPD entry in phase 2 and added a firewall rule accordingly. I can ping network C from network D without problems and NAT is working properly. I also checked the tunnel endpoint IP in the Security Policy Database and it is correct for network D. Only the entry for network A has this mysterious tunnel endpoint IP.

To sum this up: It looks like the OPNsense has a problem with a manual SPD entry when that network is not directly connected to it.

Any ideas?

Fine, so the NAT before IPsec seems to work as intended  :)

First thought:
Is it possible that your firewall masquerades when forwarding to the OPNsense?
In that case the manual SPD entry should be the masqueraded IP used by the firewall, not NETWORK_A.

Quote
I also checked the tunnel endpoint IP in the Security database and it is correct for Network D.
Only the entry for network A has this mysterious tunnel endpoint IP.

Just so I understand: is the endpoint IP correct when you enter NETWORK_D as the SPD entry and "mysterious" when you enter NETWORK_A, or did you enter both and get different endpoints?


April 14, 2021, 07:00:45 PM #10 Last Edit: April 14, 2021, 07:02:31 PM by Colani1200
Quote from: goodomens42 on April 14, 2021, 06:30:41 PM

First thought:
Is it possible that your firewall masquerades when forwarding to the OPNsense?

No it doesn't, just simple routing. Masquerading to an IP from NETWORK_D would probably even make it work because then the source is in a network that is directly attached to the OPNsense. But this is a rather nasty workaround. I'd rather like to understand what's going on, otherwise this behaviour might become a showstopper in a new scenario.

Quote
Just for me to understand: The endpoint IP is correct when entering NETWORK_D as SPD and "mysterious" when entering NETWORK_A or did you enter both and are getting different endpoints ?

I added both NETWORK_D and NETWORK_A at the same time, comma-separated. They get different endpoints in the SPD: the one for NETWORK_D is perfectly fine while the one for NETWORK_A is something crazy. Maybe you can reproduce this on one of your installations? Try adding a fictional, not directly attached network as a manual SPD entry to an existing tunnel and check the related endpoint IP in the SPD...

Well, I actually have a similar setup with branch offices (your NETWORK_A) connecting to a central OPNSense (your NETWORK_D) via WireGuard and from there to a customer network (your NETWORK_C) with NAT before IPSEC using a single tunnel IP (your VIRTUAL_IP_IN_NETWORK_B).
Works like a charm.

The only difference is that in my case the central OPNsense and the branch offices share one IPv4 range 10.x.0.0/16, with the branch offices using 10.x.[11-..].0/24 and the central OPNsense using 10.x.1.0/24, so I only have one SPD entry 10.x.0.0/16 in phase 2.

When adding a second fictional SPD entry (I tried 192.168.1.1/24) I'm getting the same endpoint configuration as for 10.x.0.0/16 as I should.

So the problem might be the "mysterious" endpoint you are getting for the NETWORK_A SPD entry.
Before thinking about this: have you tried restarting the IPsec service?

April 14, 2021, 10:59:42 PM #12 Last Edit: April 14, 2021, 11:01:50 PM by Colani1200
Thanks for testing. Somehow my OPNsense was in a really messed up state.

- Restarted IPsec service: SPD entries still there.
- Stopped IPsec service: SPD entries still there.
- Deleted the whole IPsec tunnel and restarted the service again: SPD entries still there. What?  :o
- Rebooted:  SPD entries gone.
- Recreated the IPsec tunnel from scratch: Everything correct and working as expected!

Not sure why it was in this state. Admittedly I did a lot of testing and configuration changes, and the peer was previously configured with a DNS entry (DynDNS), so that mysterious IP address might have been a leftover from that. But this should not happen of course, we're not talking about Windows, are we...?

Anyway, thank you so much for your help so far! You pointed me to the right direction, now I can continue the migration. This was only a test setup so far, the real one is with multiple phase2 entries plus there are other more complex tunnels. I'll see how far I can get now without having to show up here again. If I encounter a situation like this again I'll try to figure out if/how the problem can be reproduced.

Quote from: Colani1200 on April 14, 2021, 10:59:42 PM
Anyway, thank you so much for your help so far!

You're welcome :)
Generally speaking, I have often found that restarting the IPsec service is a good way to solve strange and otherwise inexplicable IPsec problems.

Guess what, yesterday I reconfigured the peer to use DynDNS again. Today the IP address has changed, but the tunnel endpoint entry of the manual SPD in the database still points to the old IP address. A restart of the IPsec service doesn't help, only a reboot. Looks like it is time for a bug report.
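
To help reproduce this for a bug report, stale entries could be spotted automatically. Below is a minimal Python sketch that scans a kernel SPD dump (on FreeBSD, `setkey -DP` prints it) and lists every policy whose tunnel endpoints do not include the peer's current IP. The sample text is hand-written in the usual `setkey -DP` layout and is not real output from this system; the function name `stale_policies` and all addresses are hypothetical.

```python
import re

# First line of a policy: "SRC_NET[port] DST_NET[port] proto"
SELECTOR = re.compile(r"^(\S+?)\[[^\]]*\]\s+(\S+?)\[[^\]]*\]")
# Tunnel line of a policy: "esp/tunnel/ENDPOINT1-ENDPOINT2/..."
TUNNEL = re.compile(r"esp/tunnel/([0-9A-Fa-f.:]+)-([0-9A-Fa-f.:]+)/")


def stale_policies(setkey_dump, current_peer_ip):
    """Return [((src_net, dst_net), (endpoint1, endpoint2)), ...] for every
    policy whose tunnel endpoints do not contain current_peer_ip."""
    stale = []
    selector = None
    for line in setkey_dump.splitlines():
        match = SELECTOR.match(line)
        if match:
            # Unindented line: a new policy starts here.
            selector = (match.group(1), match.group(2))
            continue
        tunnel = TUNNEL.search(line)
        if tunnel and selector is not None:
            endpoints = (tunnel.group(1), tunnel.group(2))
            if current_peer_ip not in endpoints:
                stale.append((selector, endpoints))
    return stale


# Illustrative sample only (assumed layout, made-up addresses).
SAMPLE = """\
192.168.10.0/24[any] 172.16.30.0/24[any] any
    out ipsec
    esp/tunnel/203.0.113.10-198.51.100.99/unique:2
10.20.0.0/24[any] 172.16.30.0/24[any] any
    out ipsec
    esp/tunnel/203.0.113.10-198.51.100.20/unique:1
"""

if __name__ == "__main__":
    for selector, endpoints in stale_policies(SAMPLE, "198.51.100.20"):
        print("stale:", selector, "->", endpoints)
```

The peer's current IP is passed in as a plain string here so the check stays deterministic; in practice it would come from resolving the DynDNS name at the time of the check.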