WireGuard randomly changes endpoint address to 192.168.1.1 and the tunnel fails
« on: March 13, 2024, 12:31:54 am »
Hello folks!
I have a problem with my WireGuard setup running on OPNsense.
About my setup:
I run OPNsense Business Edition 23.10.2 on an OPNsense DEC2685, and I have a router (Teltonika RUTX50) in front of my firewall, connected to the firewall's igb0 WAN port.
The Teltonika router is connected to a DSL modem on its WAN port and has a SIM card installed. It fails over automatically from WAN to mobile whenever the WAN port loses its connection to the internet.
To be able to reach the WireGuard server on my OPNsense whenever the router switches to the mobile network (with CGNAT), I use an external VPS running WireGuard and NAT with the following configuration:
Code:
[Interface]
PrivateKey = REDACTED
ListenPort = 55107
Address = 10.99.99.2/32
# port forwarding wireguard ports
PostUp = iptables -t nat -A PREROUTING -p udp -i eth0 -m multiport --dport 55551,55552,55553,55554,55555,55556 -j DNAT --to-destination 10.99.99.1
PostDown = iptables -t nat -D PREROUTING -p udp -i eth0 -m multiport --dport 55551,55552,55553,55554,55555,55556 -j DNAT --to-destination 10.99.99.1
# packet masquerading
PreUp = iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE
PostDown = iptables -t nat -D POSTROUTING -o wg0 -j MASQUERADE
[Peer]
PublicKey = REDACTED
AllowedIPs = 10.99.99.1/32
Endpoint = $DYNDNS_OF_MY_OPNSENSE:55599
The connection to my OPNsense via the VPS generally works well, both when the firewall reaches the internet via DSL and when it goes out via mobile.
But sometimes the tunnel fails for no apparent reason. When this happens, the output of wg show on the OPNsense looks like this:
Code:
root@opnsense01:~ # wg show wg2
interface: wg2
  public key: REDACTED
  private key: (hidden)
  listening port: 55599
peer: REDACTED
  endpoint: 192.168.1.1:55107
  allowed ips: 10.99.99.2/32
  latest handshake: 1 minute, 22 seconds ago
  transfer: 5.61 MiB received, 8.44 MiB sent
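For scripted checks, the endpoint can also be read from `wg show wg2 endpoints`, which prints one tab-separated line per peer (public key, then `ip:port`). A minimal sketch, assuming a single peer with an IPv4 endpoint; the helper name `endpoint_ip` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: extract the IPv4 address from the output of
# `wg show <interface> endpoints` (format: "<public key><TAB><ip>:<port>").
# Assumes a single peer with an IPv4 endpoint; reads the text from stdin.
endpoint_ip() {
    # Strip the trailing ":port" from the second field of the first line.
    awk 'NR == 1 { sub(/:[0-9]+$/, "", $2); print $2 }'
}

# Usage on the firewall (not executed here):
#   wg show wg2 endpoints | endpoint_ip
```

Compared with grepping the human-readable `wg show` output, this avoids depending on its layout, but it is only a sketch and would need adjusting for several peers or IPv6 endpoints.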
192.168.1.1 is the gateway IP address of the router in front of the firewall that I mentioned above.
When this happens, I need to restart WireGuard on the OPNsense to restore the tunnel.
To restart WireGuard, I use the following command:
Code:
/usr/local/sbin/pluginctl -s wireguard restart
The endpoint address then changes back to the correct address:
Code:
interface: wg2
  public key: REDACTED
  private key: (hidden)
  listening port: 55599
peer: REDACTED
  endpoint: $CORRECT_IP:55107
  allowed ips: 10.99.99.2/32
  latest handshake: 16 seconds ago
  transfer: 5.86 MiB received, 10.24 MiB sent
Usually this is enough to bring the tunnel back so that a WireGuard client can connect to the server. In some cases it is not, and I first have to generate traffic over the tunnel before a client can connect. I do this by simply pinging the other side (the VPS) from the OPNsense, or the other way round.
To mitigate this, and for testing purposes, I use the following cron jobs:
Code:
root@opnsense01:~ # crontab -l | tail -n2
*/1 * * * * (/usr/local/bin/keepalive_wg_tunnel.sh) > /dev/null
*/5 * * * * (ping -c 5 10.99.99.2) > /dev/null
root@opnsense01:~ # cat /usr/local/bin/keepalive_wg_tunnel.sh
#!/bin/bash
OPNSENSE_IP=$(/usr/local/bin/dig @1.1.1.1 +short wg01.some.domain)
WG_ENDPOINT_IP=$(/usr/bin/wg show wg2|/usr/bin/grep endpoint|/usr/bin/cut -d ":" -f2|xargs)
LOGFILE="/var/log/keepalive_wg.log"
TIMESTAMP=`/bin/date "+%Y-%m-%d %H:%M:%S"`
if [ "$OPNSENSE_IP" == "$WG_ENDPOINT_IP" ] ; then
    echo "DEBUG: IPs are equal, nothing to do..."
else
    echo "DEBUG: IPs are not equal, reloading wireguard config"
    echo "$TIMESTAMP | BEGIN Failover restart | OPNsense: $OPNSENSE_IP | WG_endpoint: $WG_ENDPOINT_IP" >> $LOGFILE
    /usr/local/sbin/pluginctl -s wireguard restart
    /sbin/ping -c 5 10.99.99.2
    echo "$TIMESTAMP | END Failover restart | OPNsense: $OPNSENSE_IP | WG_endpoint: $WG_ENDPOINT_IP" >> $LOGFILE
fi
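A possible variation on the script above, as a sketch only: instead of restarting the whole service, the peer's endpoint could be reset in place with `wg set <interface> peer <key> endpoint <ip:port>`, and the run could be skipped when the DNS lookup returns nothing (in the script above an empty `dig` result also counts as "not equal" and triggers a restart). The names `decide_action` and `fix_endpoint` and all variable values are placeholders, and whether the OPNsense WireGuard plugin tolerates such out-of-band changes is an untested assumption.

```shell
#!/bin/sh
# Sketch only: re-point a peer at a freshly resolved address without
# restarting WireGuard. All four values below are placeholders.
WG_IF="wg2"
PEER_KEY="REDACTED"
ENDPOINT_HOST="wg01.some.domain"
ENDPOINT_PORT="55107"

# Decide what to do from the resolved IP ($1) and the current endpoint IP ($2).
# Prints "skip" (lookup failed), "ok" (already matching) or "update".
decide_action() {
    resolved="$1"
    current="$2"
    if [ -z "$resolved" ]; then
        echo "skip"
    elif [ "$resolved" = "$current" ]; then
        echo "ok"
    else
        echo "update"
    fi
}

# Resolve the endpoint name, read the current endpoint of the first peer,
# and update the peer only when the two differ.
fix_endpoint() {
    resolved=$(dig @1.1.1.1 +short "$ENDPOINT_HOST" | tail -n 1)
    current=$(wg show "$WG_IF" endpoints |
        awk '{ sub(/:[0-9]+$/, "", $2); print $2; exit }')
    if [ "$(decide_action "$resolved" "$current")" = "update" ]; then
        wg set "$WG_IF" peer "$PEER_KEY" endpoint "$resolved:$ENDPOINT_PORT"
    fi
}

# Call fix_endpoint from cron once the placeholders are filled in.
```

Keeping the decision in its own small function makes it possible to check the logic without touching the live tunnel.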
This is my instance (WireGuard server) config on the OPNsense:
And this is the peer (WireGuard client) config on the OPNsense, which connects to the WireGuard server running on my VPS:
As you can see, the endpoint address is a domain name that points to my VPS. Could this be a potential problem?
My assumption is that, for whatever reason, DNS resolution on the OPNsense fails. Maybe this happens when the Teltonika router initiates the failover to mobile, and the name resolution for my endpoint address on the OPNsense then returns 192.168.1.1.
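One way to test this assumption would be to log what the system resolver (rather than 1.1.1.1) returns for the endpoint name and to flag any private address. A minimal sketch, assuming `dig` is installed and that the firewall's own resolver is the one WireGuard uses; `is_private_ipv4` and `check_dns` are made-up helper names:

```shell
#!/bin/sh
# Returns 0 if the argument looks like an RFC 1918 private IPv4 address,
# 1 otherwise (including for an empty argument).
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolve a name via the system resolver (no @server given to dig) and
# print a timestamped line; private answers are marked SUSPICIOUS.
check_dns() {
    host="$1"
    ip=$(dig +short "$host" | tail -n 1)
    if is_private_ipv4 "$ip"; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') SUSPICIOUS $host -> $ip"
    else
        echo "$(date '+%Y-%m-%d %H:%M:%S') ok $host -> ${ip:-no answer}"
    fi
}

# Example (not executed here), e.g. once a minute from cron:
#   check_dns wg01.some.domain >> /var/log/wg_dns_check.log
```

If a SUSPICIOUS line shows up around the time of a failover, that would support the DNS theory; if the log stays clean while the endpoint still flips to 192.168.1.1, the cause is more likely elsewhere.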
My goal was to build a setup that provides high availability for the switch connected to my firewall. A router in front of the OPNsense that takes a SIM card and can fail over automatically is therefore mandatory. Unfortunately, DSL and cellular are the only ways to connect to the internet at my "datacenter" location.
Any thoughts on this are appreciated, and feel free to ask questions if something is unclear.
Thanks!