IPsec: UDP packets do not find their way back to the socket

Started by schnipp, March 13, 2023, 09:10:59 PM

I run two IPsec site-to-site (S2S) tunnels to different locations (another Opnsense instance and a Fritzbox). At each tunnel endpoint, a DNS server operates as the authoritative DNS for the internal domains.

1. Opnsense <------ IPsec (IKEv1) ------> Fritzbox 7490
2. Opnsense <------ IPsec (IKEv2) ------> Opnsense


These DNS servers are used by the unbound instance on my Opnsense via domain overrides. I noticed that the unbound instance on my Opnsense sporadically has communication issues with the DNS of the Fritzbox, which led me to investigate. I reconfigured unbound on my local instance to use TCP instead of UDP for the upstream DNS of the Fritzbox (by directly modifying the unbound config). As a result, DNS communication with the Fritzbox now runs fine. The drawback is that the handcrafted modification is regularly overwritten by Opnsense, because Opnsense is not aware of this config parameter.


Summarized:
==========

  • TCP and UDP communication to the remote Opnsense (2) over the IPsec tunnel runs fine
  • TCP communication to the Fritzbox (1) over the IPsec tunnel runs fine
  • UDP communication to the Fritzbox (1) over the IPsec tunnel seems to have issues

I concentrated on the IPsec tunnel number 1 (Fritzbox) and investigated further by manually triggering DNS lookups with the "dig" tool in different scenarios:


  • Direct DNS lookups to the Fritzbox using TCP and Opnsense as the endpoint (Opnsense command line):  --> works fine

      $ dig +tcp -b 10.2.100.1 @192.168.1.1 fritz.box

  • Direct DNS lookups to the Fritzbox using UDP and Opnsense as the endpoint (Opnsense command line):  --> times out

      $ dig +notcp -b 10.2.100.1 @192.168.1.1 fritz.box

  • Direct DNS lookups to the Fritzbox using TCP and UDP, with a client computer behind the Opnsense as the endpoint:  --> both work fine

      $ dig +tcp -b 10.2.100.123 @192.168.1.1 fritz.box
      $ dig +notcp -b 10.2.100.123 @192.168.1.1 fritz.box


A packet dump shows that the IPsec tunnel works fine and all packets traversing it are correctly decrypted. Even in scenario 2 (which times out), the Fritzbox DNS correctly sends a response back to my local Opnsense instance, and packet dumps of the UDP DNS responses from scenarios 2 and 3 look the same.


The question is why the DNS responses of scenario 2 do not reach the local socket. The firewall rules cannot be the cause of this issue, because both the DNS request and the response packets traverse the IPsec tunnel. Furthermore...


  • Every DNS request creates a related entry in the state table of the firewall
  • Response packets have a size of around 200 bytes and are far from exceeding the MTU
  • The UDP issue only occurs sporadically. On some days scenario 2 works fine and on others it does not

I was also not able to force one result or the other in scenario 2 by restarting the IPsec tunnel, the IPsec service or the WAN connection.
This issue is not related to Opnsense 23.x only. It may have occurred in earlier versions; I don't know exactly.


Does anybody have a clue what's going on?
OPNsense 24.7.11_2-amd64

I ran continuous monitoring for four days, sending DNS requests over the IPsec tunnel to the Fritzbox every 10 minutes using TCP and UDP. The result is that TCP works flawlessly, whereas UDP sometimes works and sometimes does not. It looks like the state of the UDP issue may change only during the reauthentication phase of the IPsec tunnel.
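A monitoring run like the one described above could be sketched as follows. This is a minimal sketch, not the script actually used in the thread: the addresses and query name come from the dig commands earlier, while the log path, the timeout values and the RUN_PROBE switch are assumptions made for illustration.

```sh
#!/bin/sh
# Sketch: probe the Fritzbox DNS over UDP and TCP at a fixed interval and log the outcome.
# SRC, DNS and NAME are taken from the thread; LOG, INTERVAL and RUN_PROBE are assumptions.

SRC=10.2.100.1                    # local Opnsense address passed to dig -b
DNS=192.168.1.1                   # Fritzbox DNS behind the tunnel
NAME=fritz.box
INTERVAL=600                      # seconds, i.e. every 10 minutes
LOG=${LOG:-/tmp/dns-probe.log}    # hypothetical log location

# Map dig's exit status to a short label (0 = success, 9 = no reply from server).
classify() {
    case "$1" in
        0) echo ok ;;
        9) echo no-reply ;;
        *) echo "error($1)" ;;
    esac
}

# Run one lookup; $1 is +tcp or +notcp.
probe() {
    dig "$1" +time=3 +tries=1 -b "$SRC" @"$DNS" "$NAME" >/dev/null 2>&1
    classify "$?"
}

# One log line per round: timestamp, UDP result, TCP result.
log_round() {
    printf '%s udp=%s tcp=%s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$(probe +notcp)" "$(probe +tcp)"
}

# Loop only when explicitly requested, so the file can be sourced without side effects.
if [ "${RUN_PROBE:-0}" = 1 ]; then
    while :; do
        log_round >>"$LOG"
        sleep "$INTERVAL"
    done
fi
```

Started with RUN_PROBE=1, the resulting log can be lined up against the reauthentication times in the IPsec log to check whether the UDP result really flips only at those moments.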

In my eyes, it could be a race condition somewhere in FreeBSD or Opnsense. Can anybody give me a hint on how to debug further?
OPNsense 24.7.11_2-amd64

I did some further investigation but still have no clue why the response packets do not reach the socket.


  • I decoded and checked integrity of all IPsec packets in wireshark:

    • The MAC (Message Authentication Code) was valid
    • Wireshark was able to successfully decode all IPsec packets
    • --> Result: the Fritzbox operates correctly

  • Dumping the SPD entries (setkey -DP) of the IPsec connection on the Opnsense console showed the following:

    • DNS request with UDP: the "last used" entry of the SPDs is only updated in the direction Opnsense -> Fritzbox, not in the other direction. This shows that the response packet has been dropped
    • DNS request with TCP: the "last used" entries of the SPDs are updated in both directions. Response packets are successfully delivered back to the socket (dig command)
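The SPD comparison above can be made repeatable with a small script. The following is only a sketch: it assumes that `setkey -DP` prints a direction line (`in ipsec` / `out ipsec`) and a `lastused:` field for each policy, and the function names and the RUN_CHECK switch are made up for illustration. It has to run as root on the Opnsense console.

```sh
#!/bin/sh
# Sketch: compare the SPD "lastused" timestamps before and after a single lookup.
# Assumes setkey -DP prints "in ipsec"/"out ipsec" and "lastused:" per policy.

SRC=10.2.100.1
DNS=192.168.1.1

# Reduce a setkey -DP dump (read from stdin) to the direction and lastused lines.
spd_summary() {
    grep -E '^[[:space:]]*(in|out)[[:space:]]|lastused:'
}

# Snapshot the SPD, run one lookup, snapshot again and print the difference.
# $1 is +tcp or +notcp.
spd_check() {
    tmp=$(mktemp -d) || return 1
    setkey -DP | spd_summary >"$tmp/before"
    sleep 2    # lastused has one-second resolution, so keep the snapshots apart
    dig "$1" +time=3 +tries=1 -b "$SRC" @"$DNS" fritz.box >/dev/null 2>&1
    setkey -DP | spd_summary >"$tmp/after"
    echo "== $1 =="
    diff "$tmp/before" "$tmp/after"
    rm -rf "$tmp"
}

# Run both variants only when explicitly requested.
if [ "${RUN_CHECK:-0}" = 1 ]; then
    spd_check +notcp
    spd_check +tcp
fi
```

If the observation above holds, the UDP run should show a changed timestamp only below the outbound policy, while the TCP run should show changes for both directions. Other traffic using the same policies will also touch the timestamps, so the check is most meaningful on a quiet tunnel.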
Does anybody have an idea how to debug further?
OPNsense 24.7.11_2-amd64

Is anybody using IPsec site-to-site connections together with UDP inside the tunnel or can anybody give me a hint how to debug further?
OPNsense 24.7.11_2-amd64

An IPsec tunnel is agnostic of the protocol inside. As long as it's IP. Use tcpdump to trace the packets:

source incoming interface
source tunnel interface
destination tunnel interface
destination outgoing interface

They get lost or are blocked SOMEWHERE. Depending on that somewhere, a firewall rule, NAT rule, ... might be at work here.
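For lookups that originate on the Opnsense itself, the two local trace points could be captured roughly as follows. This is a sketch under assumptions: `igb0` is a hypothetical WAN interface name, `enc0` is assumed to be the enc(4) interface that carries the decrypted tunnel traffic, and the function names and the RUN_CAPTURE switch are invented for this example.

```sh
#!/bin/sh
# Sketch: capture the encrypted (outer) and decrypted (inner) view of the DNS exchange.
# WAN_IF and ENC_IF are assumptions and must be adjusted to the local system.

WAN_IF=${WAN_IF:-igb0}    # hypothetical WAN interface name
ENC_IF=${ENC_IF:-enc0}    # assumed enc(4) interface with the decrypted packets
DNS=192.168.1.1           # Fritzbox DNS from the thread

# Filter for the encrypted side: plain ESP, or ESP in UDP when NAT traversal is used.
outer_filter() {
    echo "esp or udp port 4500"
}

# Filter for the decrypted side: DNS traffic to and from the given host.
inner_filter() {
    echo "host $1 and port 53"
}

# $1 = interface, $2 = filter expression, $3 = number of packets to capture.
capture() {
    tcpdump -n -i "$1" -c "$3" "$2"
}

# Start both captures only when explicitly requested (requires root).
if [ "${RUN_CAPTURE:-0}" = 1 ]; then
    capture "$WAN_IF" "$(outer_filter)" 20 &
    capture "$ENC_IF" "$(inner_filter "$DNS")" 20
    wait
fi
```

A response that shows up in both captures but never reaches dig would have been lost after decryption, that is, between the tunnel interface and the local socket.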
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: pmhausen on March 26, 2023, 07:41:09 PM
An IPsec tunnel is agnostic of the protocol inside. As long as it's IP. Use tcpdump to trace the packets:

Thanks for your answer. That's clear, but it does not solve the problem.

Quote from: pmhausen on March 26, 2023, 07:41:09 PM
source incoming interface
source tunnel interface
destination tunnel interface
destination outgoing interface

They get lost or are blocked SOMEWHERE. Depending on that somewhere, a firewall rule, NAT rule, ... might be at work here.

It looks like you haven't read the whole thread because the issue only occurs sporadically and firewall rules, NAT rules and state table entries are fine.
OPNsense 24.7.11_2-amd64

Sorry. I overlooked you had already used Wireshark. Weird.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Indeed, this behavior is really strange. The only idea I currently have is that a race condition probably occurs somewhere. The next step I'll try is to enable the sysctl "net.inet.ipsec.debug" to get more insights via dmesg, provided that the kernel or the ipsec module has debugging support compiled in.
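That debugging step could look roughly like the sketch below. It assumes root access and that the kernel actually emits IPsec debug messages (as noted above, this depends on how it was built); the function names and the RUN_DEBUG switch are made up for illustration.

```sh
#!/bin/sh
# Sketch: enable net.inet.ipsec.debug for one failing UDP lookup and show new kernel messages.
# Whether any messages appear depends on the kernel's IPsec debugging support.

# Print only the lines of stdin that follow the first $1 lines.
lines_after() {
    tail -n "+$(( $1 + 1 ))"
}

debug_lookup() {
    old=$(sysctl -n net.inet.ipsec.debug) || return 1
    seen=$(dmesg | wc -l)                         # size of the message buffer before the test
    sysctl net.inet.ipsec.debug=1 >/dev/null
    dig +notcp +time=3 +tries=1 -b 10.2.100.1 @192.168.1.1 fritz.box >/dev/null 2>&1
    sysctl net.inet.ipsec.debug="$old" >/dev/null # restore the previous setting
    # Simplification: this assumes the kernel message buffer did not wrap in between.
    dmesg | lines_after "$seen"
}

# Run only when explicitly requested.
if [ "${RUN_DEBUG:-0}" = 1 ]; then
    debug_lookup
fi
```

Comparing the IPsec statistics before and after the lookup (on FreeBSD, probably via `netstat -s -p ipsec`) might additionally show whether a policy-violation counter increases when the response is dropped.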
OPNsense 24.7.11_2-amd64

I remember a problem with UDP fragments and RADIUS EAP over the tunnel. Really strange problem.

Quote from: mimugmail on March 27, 2023, 08:54:43 AM
I remember a problem with UDP fragments and RADIUS EAP over the tunnel. Really strange problem.

IPsec only? Or also OpenVPN/WG?
kind regards
chemlud
____
"The price of reliability is the pursuit of the utmost simplicity."
C.A.R. Hoare

Felix Eichhorn's premium cat food with the extra portion of energy

A router is not a switch - A router is not a switch - A router is not a switch - A rou....

I did an additional test with the failing command on Opnsense:

  $ dig +notcp -b 10.2.100.1 @192.168.1.1 fritz.box

I redirected the DNS request from Opnsense to the Fritzbox (through the IPsec tunnel) to a transparent UDP proxy running on a client computer, which itself forwards the requests. With this setup, everything works well. So my conclusion is that something is broken in the processing of the IPsec security policy in conjunction with the routing to a local endpoint. I think I have to raise a ticket in the strongswan project.
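A simplified, non-transparent variant of this test can be sketched with socat on the client computer. This is not the proxy setup used above, which is not described in detail: the relay port 5353, the function names and the RUN_RELAY switch are assumptions, and socat has to be installed on the client.

```sh
#!/bin/sh
# Sketch: relay UDP DNS queries on a LAN client to the Fritzbox DNS behind the tunnel.
# LISTEN_PORT is an arbitrary assumption; DNS is the Fritzbox address from the thread.

LISTEN_PORT=${LISTEN_PORT:-5353}
DNS=192.168.1.1

# socat address that accepts UDP datagrams on the given port, one child per peer.
listen_addr() {
    echo "UDP4-LISTEN:$1,fork,reuseaddr"
}

# socat address of the upstream DNS server.
target_addr() {
    echo "UDP4:$1:$2"
}

start_relay() {
    socat "$(listen_addr "$LISTEN_PORT")" "$(target_addr "$DNS" 53)"
}

# Start the relay only when explicitly requested.
if [ "${RUN_RELAY:-0}" = 1 ]; then
    start_relay
fi
```

With the relay running on the client (10.2.100.123 in the earlier examples), the lookup on the Opnsense becomes `dig +notcp -p 5353 -b 10.2.100.1 @10.2.100.123 fritz.box`. The response from the Fritzbox is then routed through the Opnsense to the client instead of being delivered to a local socket, which is the difference the conclusion above points at.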
OPNsense 24.7.11_2-amd64

Info: ticket raised in the strongswan project: https://github.com/strongswan/strongswan/issues/1635

Update:
As discussed with one of the strongswan maintainers, raising an issue in the strongswan project is not correct, because the IPsec stack is provided by FreeBSD itself. I'll try to discuss the bug via the FreeBSD bug tracker.
OPNsense 24.7.11_2-amd64