SNAT and Virtual Tunnel Interfaces - An in-depth analysis

Started by Monviech (Cedrik), October 07, 2023, 11:10:37 AM

I have posted this in the FreeBSD forum in hopes of getting some more insight: FreeBSD Forum

Additional tests with IPFW and NATD, and IPFW and IPFW_NAT: https://forum.opnsense.org/index.php?topic=36456.0

Since you can't do SNAT on VTI ipsecXX interfaces without setting some tunables (which break enc0 firewalling and NAT), I thought I'd be clever and go on a quest to find a working routed solution with SNAT and firewalling over policy-based tunnels.

My eyes fell on GRE:

- I created a GRE tunnel inside an IPsec Policy Based VPN in Transport Mode.
- Routing into the GRE interfaces worked, traffic flowed as expected (there even was an automatic GRE Gateway created for the static routes, neat)
- SNAT into the GRE interface worked, but once the reply packets flowed back from site B to site A, they got stuck in the GRE interface on site A and never returned to the initiator of the request.
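
For anyone who wants to reproduce the GRE part outside the GUI, the tunnel on Site A would look roughly like this on plain FreeBSD. This is only a sketch under assumptions: I configured everything through the OPNsense GUI and have not checked which exact commands it runs, and the addresses and MTU are simply the ones from the setup listed further down.

```shell
# Sketch only: run as root on Site A; swap the addresses for Site B.
# Create the GRE interface.
ifconfig gre0 create
# Outer tunnel endpoints: the two WAN addresses protected by IPsec transport mode.
ifconfig gre0 tunnel 172.16.11.2 172.16.11.3
# Inner point-to-point addresses: local first, then the peer.
ifconfig gre0 inet 10.21.1.2 10.21.1.3 netmask 255.255.255.0
# Lower the MTU to leave room for the GRE and ESP overhead, then bring it up.
ifconfig gre0 mtu 1398 up
# Static route to the remote LAN through the tunnel peer.
route add -net 192.168.2.0/24 10.21.1.3
```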

So all in all, with those tunnel interfaces the echo request and echo reply worked with SNAT, but on the final step where it would have to get out of the tunnel interface back to the original initiator, the flow just stops. As if pf forgot there's a NAT Table...?

=================================================================================================================

       +-------------------+                 +------------------+
       |  OPNsense Site A  |                 |  OPNsense Site B |
       |-------------------|                 |------------------|
       | LAN  192.168.1.1  |                 | LAN  192.168.2.1 |
       | OPT1 192.168.101.1| IPsec Transport |                  |
       | WAN  172.16.11.2  | =============== | WAN  172.16.11.3 |
       | gre0 10.21.1.2    | --------------- | gre0 10.21.1.3   |
       +-------|-----------+   GRE Tunnel    +-------|----------+
               |                                     |   
               |                                     |
               |                                     |
               |                                     |
               | 
         LAN 192.168.1.100
              --OR--
         OPT1 192.168.101.100                 LAN 192.168.2.3
           [Host Site A]                       [Host Site B]
  (Only one interface is connected)

-----------------------------------------------------------------------------------------------------------------

The setup is like this:

OPNsense Site A:
Interfaces:

LAN (hn0)       -> v4: 192.168.1.1/24
OPT1 (hn2)      -> v4: 192.168.101.1/24
WAN (hn1)       -> v4: 172.16.11.2/24
gre10 (gre0)    -> v4: 10.21.1.2/24 - MTU 1398


Routes

Internet:
Destination        Gateway            Flags     Netif Expire
10.19.1.2          link#9             UH          lo1
10.21.1.3          link#8             UH         gre0
127.0.0.1          link#1             UH          lo0
172.16.11.0/24     link#6             U           hn1
172.16.11.2        link#6             UHS         lo0
192.168.1.0/24     link#5             U           hn0
192.168.1.1        link#5             UHS         lo0
192.168.2.0/24     10.21.1.3          UGS        gre0
192.168.101.0/24   link#7             U           hn2
192.168.101.1      link#7             UHS         lo0


PF filter rules

pass out log all flags S/SA keep state allow-opts label "fae559338f65e11c53669fc3642c93c2"
pass out log on enc0 all flags S/SA keep state label "c1eff64cbafdd6b80448f92cd4aff7e5"
pass out log route-to (gre0 10.21.1.3) inet from (gre0) to ! (gre0:network) flags S/SA keep state allow-opts label "f7f077b5334caa29bc835d174f88b548"
pass in quick inet all flags S/SA keep state label "523ba68a597fc0e535b425d2ef260b6b"
pass in quick inet6 all flags S/SA keep state label "523ba68a597fc0e535b425d2ef260b6b"


PF nat rules

nat on gre0 inet from (hn2:network) to 192.168.2.0/24 -> (hn0:0) port 1024:65535


Swanctl SAs
(Ignore the 0 bytes, it works but I didn't pass traffic when I made this --list-sas)

21924229-8cb2-496e-b1d7-26cc4dc35f7d: #4, ESTABLISHED, IKEv2, e51ca149a80e6d9a_i* 4800a403859d825a_r
  local  '172.16.11.2' @ 172.16.11.2[4500]
  remote '172.16.11.3' @ 172.16.11.3[4500]
  AES_CBC-128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/ECP_256
  established 14s ago, rekeying in 13406s
  e200d362-fc18-4844-aa62-0529148faa58: #3, reqid 1, INSTALLED, TRANSPORT, ESP:AES_CBC-128/HMAC_SHA2_256_128
    installed 14s ago, rekeying in 3247s, expires in 3946s
    in  cbbd2ddd,      0 bytes,     0 packets
    out ca6ec24e,      0 bytes,     0 packets
    local  172.16.11.2/32
    remote 172.16.11.3/32


Host Site A (Ubuntu):

LAN 192.168.1.100
              --OR--
OPT1 192.168.101.100


-----------------------------------------------------------------------------------------------------------------

OPNsense Site B:


LAN (hn0)       -> v4: 192.168.2.1/24
WAN (hn1)       -> v4: 172.16.11.3/24
gre10 (gre0)    -> v4: 10.21.1.3/24 - MTU 1398


Routes

Internet:
Destination        Gateway            Flags     Netif Expire
10.19.1.3          link#8             UH          lo1
10.21.1.2          link#7             UH         gre0
127.0.0.1          link#1             UH          lo0
172.16.11.0/24     link#6             U           hn1
172.16.11.3        link#6             UHS         lo0
192.168.1.0/24     10.21.1.2          UGS        gre0
192.168.2.0/24     link#5             U           hn0
192.168.2.1        link#5             UHS         lo0


PF filter rules

pass out log all flags S/SA keep state allow-opts label "fae559338f65e11c53669fc3642c93c2"
pass out log on enc0 all flags S/SA keep state label "c1eff64cbafdd6b80448f92cd4aff7e5"
pass out log route-to (gre0 10.21.1.2) inet from (gre0) to ! (gre0:network) flags S/SA keep state allow-opts label "64abd34393f7bf3840c44e806a347bf6"
pass in quick inet all flags S/SA keep state label "357faa0befdb804e3fe5f8345c9b76c7"
pass in quick inet6 all flags S/SA keep state label "357faa0befdb804e3fe5f8345c9b76c7"


PF nat rules

NONE


Swanctl SAs
(Ignore the 0 bytes, it works but I didn't pass traffic when I made this --list-sas)

fa5ea186-0bb5-43fe-b570-15dd7f1b728e: #3, ESTABLISHED, IKEv2, e51ca149a80e6d9a_i 4800a403859d825a_r*
  local  '172.16.11.3' @ 172.16.11.3[4500]
  remote '172.16.11.2' @ 172.16.11.2[4500]
  AES_CBC-128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/ECP_256
  established 68s ago, rekeying in 13094s
  a78735eb-f23b-4284-a4e3-30ddc2e474a2: #3, reqid 1, INSTALLED, TRANSPORT, ESP:AES_CBC-128/HMAC_SHA2_256_128
    installed 68s ago, rekeying in 3483s, expires in 3892s
    in  ca6ec24e,      0 bytes,     0 packets
    out cbbd2ddd,      0 bytes,     0 packets
    local  172.16.11.3/32
    remote 172.16.11.2/32


Host Site B:

LAN  192.168.2.3


=================================================================================================================

Now that the setup is described, here come the packet captures:

First a Sanity Check that the routing itself works as expected.

This packet capture shows:
- Host Site A (192.168.1.100) initiating an ICMP echo request to Host Site B (192.168.2.3)

Packet is received on LAN interface of OPNsense A from Host A

OPNsense A:~ # tcpdump -i hn0 proto ICMP -n
08:47:05.654822 IP 192.168.1.100 > 192.168.2.3: ICMP echo request, id 45, seq 1, length 64
08:47:05.656662 IP 192.168.2.3 > 192.168.1.100: ICMP echo reply, id 45, seq 1, length 64

Packet is forwarded into GRE tunnel on OPNsense A

OPNsense A:~ # tcpdump -i gre0 proto ICMP -n
08:47:05.654909 IP 192.168.1.100 > 192.168.2.3: ICMP echo request, id 45, seq 1, length 64
08:47:05.656581 IP 192.168.2.3 > 192.168.1.100: ICMP echo reply, id 45, seq 1, length 64

Packet is received by GRE tunnel on OPNsense B

OPNsense B:~ # tcpdump -i gre0 proto ICMP -n
08:47:05.655683 IP 192.168.1.100 > 192.168.2.3: ICMP echo request, id 45, seq 1, length 64
08:47:05.656101 IP 192.168.2.3 > 192.168.1.100: ICMP echo reply, id 45, seq 1, length 64

Packet is forwarded to LAN on OPNsense B and sent to Host B

OPNsense B:~ # tcpdump -i hn0 proto ICMP -n
08:47:05.655790 IP 192.168.1.100 > 192.168.2.3: ICMP echo request, id 45, seq 1, length 64
08:47:05.656069 IP 192.168.2.3 > 192.168.1.100: ICMP echo reply, id 45, seq 1, length 64

Echo Request and Echo Reply route as expected between Host A and Host B

-----------------------------------------------------------------------------------------------------------------

Now with SNAT, which DOESN'T WORK as expected:

This packet capture shows:
- Host Site A (192.168.101.100) initiating an ICMP echo request to Host Site B (192.168.2.3)
- and getting SNATed to the LAN interface address of OPNsense Site A (192.168.1.1)

Here are the states of the firewalls:
OPNsense A:~ # pfctl -ss | grep -i icmp
all icmp 192.168.2.3:49 <- 192.168.101.100:49       0:0
all icmp 192.168.1.1:45468 (192.168.101.100:49) -> 192.168.2.3:45468       0:0

OPNsense B:~ # pfctl -ss | grep -i icmp
all icmp 192.168.1.1:45468 -> 192.168.2.3:45468       0:0
all icmp 192.168.2.3:45468 -> 192.168.1.1:45468       0:0
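
To pick the translated states out of a longer pfctl -ss listing, something like the following could help. It is a minimal sketch: nat_states is a helper name I made up, and it assumes pf prints the pre-NAT address in parentheses, as in the OPNsense A output above.

```shell
#!/bin/sh
# Keep only state lines that carry a NAT translation, i.e. lines where
# the original (pre-NAT) address:port is shown in parentheses.
# nat_states is a hypothetical helper name, not part of pfctl.
nat_states() {
    grep -E '\([0-9.]+:[0-9]+\)'
}

# Sample input copied from the OPNsense A state table above.
sample='all icmp 192.168.2.3:49 <- 192.168.101.100:49       0:0
all icmp 192.168.1.1:45468 (192.168.101.100:49) -> 192.168.2.3:45468       0:0'

# Prints only the second sample line, the one with the translation.
printf '%s\n' "$sample" | nat_states
```

On a live firewall the same filter would be fed directly: pfctl -ss | nat_states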



Packet is received on OPT1 interface of OPNsense A from Host A

OPNsense A:~ # tcpdump -i hn2 proto ICMP -n
08:55:12.675487 IP 192.168.101.100 > 192.168.2.3: ICMP echo request, id 47, seq 1, length 64

Packet is forwarded into GRE tunnel on OPNsense A and SNATed to the LAN interface IP 192.168.1.1

OPNsense A:~ # tcpdump -i gre0 proto ICMP -n
08:55:12.675565 IP 192.168.1.1 > 192.168.2.3: ICMP echo request, id 41970, seq 1, length 64
08:55:12.676751 IP 192.168.2.3 > 192.168.1.1: ICMP echo reply, id 41970, seq 1, length 64

Packet is received by GRE tunnel on OPNsense B

OPNsense B:~ # tcpdump -i gre0 proto ICMP -n
08:55:12.675977 IP 192.168.1.1 > 192.168.2.3: ICMP echo request, id 41970, seq 1, length 64
08:55:12.676516 IP 192.168.2.3 > 192.168.1.1: ICMP echo reply, id 41970, seq 1, length 64

Packet is forwarded to LAN on OPNsense B and sent to Host B

OPNsense B:~ # tcpdump -i hn0 proto ICMP -n
08:55:12.676171 IP 192.168.1.1 > 192.168.2.3: ICMP echo request, id 41970, seq 1, length 64
08:55:12.676489 IP 192.168.2.3 > 192.168.1.1: ICMP echo reply, id 41970, seq 1, length 64

The Echo Reply comes all the way back to gre0 on OPNsense A, but there it isn't translated back by the NAT table and forwarded to the hn2 (OPT1) interface.

I also tried it with a vxlan interface: same behavior.

=================================================================================================================

If you got this far, thank you for reading this post. It took some time to write, so it took some time to read too :)

I hope there's somebody here with some in-depth knowledge who will confirm that SNATing into GRE tunnel interfaces won't work because of some PF limitations. (pf is a firewall implementation in FreeBSD that both pfSense and OPNsense use.)
Hardware:
DEC740

October 07, 2023, 12:17:13 PM #1 Last Edit: October 08, 2023, 09:11:57 AM by Monviech
Alright, now I'm radically confused: when I set the tunables for IPsec to filter on VTI interfaces, ALL other virtual tunnel interfaces (gre, vxlan, gif, etc.) ALSO start to work with SNAT.

System: Settings: Tunables:
Name: net.enc.in.ipsec_filter_mask
Value: 0
Name: net.enc.out.ipsec_filter_mask
Value: 0
Name: net.inet.ipsec.filtertunnel
Value: 1
Name: net.inet6.ipsec6.filtertunnel
Value: 1
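
For a quick test without going through the GUI, the same values can be inspected and set from a root shell with sysctl. This is just a sketch: values set this way are assumed not to survive a reboot, so the Tunables page is still the place to make them permanent.

```shell
# Show the current values of the four tunables.
sysctl net.enc.in.ipsec_filter_mask net.enc.out.ipsec_filter_mask
sysctl net.inet.ipsec.filtertunnel net.inet6.ipsec6.filtertunnel

# Set them at runtime to the values listed above (run as root).
sysctl net.enc.in.ipsec_filter_mask=0
sysctl net.enc.out.ipsec_filter_mask=0
sysctl net.inet.ipsec.filtertunnel=1
sysctl net.inet6.ipsec6.filtertunnel=1
```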

Why do they have ipsec in their name if they also change the behavior of other virtual tunnel interfaces?

October 07, 2023, 03:25:11 PM #2 Last Edit: October 18, 2023, 10:30:55 PM by Monviech
I have expanded my test to wireguard.

With wireguard, I have created the exact setup as with GRE over IPsec Transport Mode.

Wireguard over IPsec Transport Mode: (I know it's already encrypted, it was just a sanity check)

- Routing works as expected
- SNAT into the tunnel works (without the tunables from above)
- The tunables from above don't influence the wgXX interfaces.

That must mean the following:

- The FreeBSD Network Stack and PF treat "greXX, vxlanXX, gifXX, ipsecXX" all as the same kind of virtual tunnel interfaces. These interfaces are all influenced by the net.inet.ipsec.filtertunnel tunable.

- The FreeBSD Network Stack and PF treat "wgXX" as their own kind of virtual tunnel interface. This interface isn't influenced at all by the net.inet.ipsec.filtertunnel tunable.


The result of these tests is the following:

- There is currently NO WAY to get a filtered and NATed, routed and policy based VPN at the same time with IPsec, even with tricks like GRE, VXLAN and GIF over IPsec.

- The only way to get a filtered and NATed policy based AND routed tunnel is to use policy based IPsec and routed Wireguard side by side.

- I didn't test a routed OpenVPN tunnel, but I imagine the same results as with wireguard.

Edit: Redacted some false information.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248474

Aloha :)

Is this still true today? I am currently trying to move a route-based tunnel that uses SNAT/DNAT on top from a Sophos FW to OPNsense, and I can't get it working. I see that the BGP routes are working, and I see that OPNsense sends packets over with the correct NAT IP, but it seems there is nothing coming back.

Here's an example setup, with more information in the following posts, that shows NAT working for route-based tunnels.

https://forum.opnsense.org/index.php?topic=36254.msg176819#msg176819