23.1 Legacy Series / Re: Wireguard kernel not working like it should
« on: January 28, 2023, 05:40:23 pm »
I'm experiencing a similar issue after upgrading to 23.1: WireGuard handshakes time out when I'm at home, so I decided to do some debugging.
My Android WireGuard client is set up to point at a hostname, vpn.mydomain.com:51820, which is an A record pointing at my public IP, and I use Android's Always-on VPN feature on this tunnel. I have all 3 of the NAT Reflection settings in OPNsense (under Firewall > Settings > Advanced) turned on.
igb1 is my LAN, igb2 is my WAN, and wg1 is the WireGuard interface. When I caught the Android client sending handshakes and timing out, I turned on debugging for wg1 (ifconfig wg1 debug). The output showed that OPNsense was receiving the handshake and sending a reply to the client, which led me to dig deeper.
wg1: Receiving handshake initiation from peer 1
wg1: Sending handshake response to peer 1
wg1: Receiving handshake initiation from peer 1
wg1: Sending handshake response to peer 1
I ran tcpdump on igb1 and could see the handshake packets from my phone (192.168.1.68) directed to my public IP (let's call it 203.0.113.7), but there was no traffic flowing back to the phone:
root@opnsense:~ # tcpdump -nn -i igb1 host 192.168.1.68 and port 51820
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
10:45:49.935640 IP 192.168.1.68.35190 > 203.0.113.7.51820: UDP, length 148
10:45:54.967559 IP 192.168.1.68.35190 > 203.0.113.7.51820: UDP, length 148
10:46:03.355900 IP 192.168.1.68.35190 > 203.0.113.7.51820: UDP, length 148
10:46:11.883729 IP 192.168.1.68.35190 > 203.0.113.7.51820: UDP, length 148
I then checked igb2 and noticed that OPNsense is sending the replies destined for my LAN phone out the WAN interface:
root@opnsense:~ # tcpdump -nn -i igb2 host 192.168.1.68 and port 51820
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb2, link-type EN10MB (Ethernet), capture size 262144 bytes
11:14:38.316335 IP 203.0.113.7.51820 > 192.168.1.68.35190: UDP, length 92
11:14:43.354698 IP 203.0.113.7.51820 > 192.168.1.68.35190: UDP, length 92
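For anyone who wants to check their own captures for the same symptom, here is a rough Python sketch that reads tcpdump lines in the format above and flags packets addressed to a LAN host but seen on a non-LAN interface. It also labels packets by UDP payload length, since WireGuard's handshake initiation (148 bytes), handshake response (92 bytes) and cookie reply (64 bytes) have fixed sizes, which matches the 148 and 92 in the captures. The function names and the 192.168.1.0/24 LAN subnet are assumptions for illustration; the post only gives the phone's address.

```python
import ipaddress
import re

# Assumption: the LAN is 192.168.1.0/24 (the post only shows 192.168.1.68).
LAN_NET = ipaddress.ip_network("192.168.1.0/24")

# Fixed UDP payload sizes of WireGuard's non-data messages.
WG_TYPES = {
    148: "handshake initiation",
    92: "handshake response",
    64: "cookie reply",
}

# Matches lines such as:
# 10:45:49.935640 IP 192.168.1.68.35190 > 203.0.113.7.51820: UDP, length 148
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): UDP, length (?P<len>\d+)$"
)


def parse_line(line):
    """Parse one tcpdump line; return None for headers and other lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    length = int(m["len"])
    return {
        "ts": m["ts"],
        "src": m["src"],
        "dst": m["dst"],
        "length": length,
        "kind": WG_TYPES.get(length, "other"),
    }


def misrouted(lines, iface, lan_iface="igb1"):
    """Return packets addressed to a LAN host but captured on a non-LAN interface."""
    hits = []
    for line in lines:
        pkt = parse_line(line)
        if pkt is None:
            continue
        if iface != lan_iface and ipaddress.ip_address(pkt["dst"]) in LAN_NET:
            hits.append(pkt)
    return hits


if __name__ == "__main__":
    wan_capture = [
        "listening on igb2, link-type EN10MB (Ethernet), capture size 262144 bytes",
        "11:14:38.316335 IP 203.0.113.7.51820 > 192.168.1.68.35190: UDP, length 92",
    ]
    for pkt in misrouted(wan_capture, iface="igb2"):
        print(pkt["ts"], pkt["kind"], "to", pkt["dst"], "seen on igb2")
```

Feeding it the igb2 capture should flag the 92-byte handshake responses, while the igb1 capture should produce no hits, since those packets are on the expected interface.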
It seems that after some indeterminate period of time, wireguard-kmod forgets which interface it should be replying on and ignores the NAT Reflection rules. If I disconnect the Android client and reconnect, everything goes back to normal and it no longer tries to send traffic out the wrong interface.
This only happens with wireguard-kmod; I've had absolutely no issues with wireguard-go. I also don't believe this is a 23.1-specific issue, because I experienced the same thing on 22.7 a few months back when I tried to switch to wireguard-kmod and ultimately had to revert to wireguard-go.
Hopefully this is enough detail for a developer to reproduce my issue. If you have any questions or need further clarification, please let me know.