Wireguard issue(s)

Started by Dark-Sider, January 20, 2026, 11:15:08 PM

January 20, 2026, 11:15:08 PM Last Edit: January 20, 2026, 11:36:36 PM by Dark-Sider
Hi,

Since WireGuard made its way into OPNsense it has worked OK-ish, however its "stability" is not comparable to OpenVPN's. Still, I like the concept behind WireGuard, so I'm putting up with some issues and keep using it.

Last week I had to restart my OPNsense box (25.1.12) and, with me away on a business trip, WireGuard failed me (again) after the restart. I usually solve this by de- and reactivating my one and only wg0 instance via the web GUI. After restarting wg0 everything works as it is supposed to.

Since the issue caught me cold (again), I did some forum reading and found interesting threads regarding wg, DNS, stale connections etc.:
https://forum.opnsense.org/index.php?topic=49432.0
https://forum.opnsense.org/index.php?topic=37905.0
https://forum.opnsense.org/index.php?topic=42648.0

Honestly, I didn't know about wg's DNS quirks: resolution issues after a dynamic IP refresh, and wg only resolving endpoint hostnames once on startup and never refreshing them. One might argue that a static IP would solve such problems, but static IPs on consumer lines are hard to get these days. Even IPv6 is dynamic with my ISP.

While I think wg's behaviour is a severe design oversight in the protocol/module (nothing to do with OPNsense, though), I appreciate that a cron job exists that is supposed to fix the issue.

I activated the cron job to run */5 * * * *, however my issue was not resolved. My mobile phone was not able to connect via IPv6 or IPv4 (both usually work) to my OPNsense box. I did a packet capture on port 51820: the packets from my phone arrived, but no response was sent back.

I then noticed there is another cron job called "restart wireguard service". I set this one up to run */7 * * * * as well, but after waiting 14 minutes my WireGuard log still showed the service as started last week, with no further log entries.

While looking at the logs I found that the wg status page was quite empty, only showing wg0 with my local endpoint at port 33###. I didn't notice this at first, but my wg setup only uses port 51820. Also, no peers were shown at all on the status page.
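For anyone wanting to check this symptom from the shell rather than the status page: a small sketch that compares the port wg0 actually listens on against the configured one. The real query would be `wg show wg0 listen-port`; here the value is mocked with the mismatch I saw, and 51820 is the port from my instance config.

```shell
#!/bin/sh
# check_port: warn when WireGuard's actual listen port differs from the
# configured one. Usage: check_port <actual> <configured>
check_port() {
    if [ "$1" -ne "$2" ]; then
        echo "wg0 port mismatch: listening on $1, configured $2"
    else
        echo "wg0 listening on configured port $2"
    fi
}

# In practice: check_port "$(wg show wg0 listen-port)" 51820
# Mocked here with the values from my box (actual port was 33###):
check_port 33421 51820
```

If the two ports differ even though the instance config says otherwise, the running wg0 clearly did not pick up its configuration, which matches the empty peer list on the status page.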

I have 3 road warrior peers configured ("dial-in only"): my phone, my laptop and a GL.iNet travel router. I also have a site2site connection configured to a remote network.

Only after I deactivate and reactivate my instance are all 4 peers listed on the status page. Once the peers are listed, the connections start working again.

My OPNsense runs virtualized (yes, it could use a firmware update, which I will do later) and sits on a PPPoE dial-up line at a German ISP (M-Net), with both IPv4 and IPv6 connectivity. Luckily my ISP connection is hyper-stable, so reboots and disconnects, and thus IP changes, happen very rarely.

I still wonder why wg needs a kick in the... after my box boots up. And shouldn't that "restart wireguard service" cron job also fix my issue?

thanks,
Dark-Sider

January 21, 2026, 09:14:10 AM #1

If your connection via wireguard is inbound, it means that your devices resolve the DNS entry. On your OPNsense, I would not expect any hostname in the configuration, wireguard just binds to all interfaces (* 51820).

So I assume your clients resolve the wrong IP address and send packets to the wrong destination.
Hardware:
DEC740

January 21, 2026, 09:41:56 AM #2 Last Edit: January 21, 2026, 12:26:15 PM by meyergru
Hi Fabian, I think the problem is the road warriors (as they do not notice the IP change of your server).

I wrote the stale connection job and of course, it only does half of the job...

Wireguard is inherently symmetrical, such that when a potential connection exists, both peers will send handshakes. If one side changes its IP, the other side will lose contact, but it can still try to reach the other side - this is normally true with a site-2-site connection, including a new DNS resolution.

Thus, it is usually beneficial if one side (the server, so to speak) has a fixed IP, because it will always come up on that IP and can be reached regardless. The cron job fixes a lot of things when both site-2-site partners are behind dynamic IPs and both use the job. In that case, regardless of who changes its IP, the other side will notice that the last handshake is older than 135 seconds and re-initiate the connection.
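The staleness check can be sketched roughly like this (my own illustration, not the actual OPNsense script; `wg show wg0 latest-handshakes` prints one `<pubkey> <epoch>` line per peer, with 0 meaning no handshake has completed yet):

```shell
#!/bin/sh
# is_stale: succeeds when the last handshake is older than the threshold
# (135 s, as in the cron job's logic). Args: <handshake-epoch> <now-epoch>
THRESHOLD=135

is_stale() {
    last=$1; now=$2
    # a peer that never completed a handshake (epoch 0) counts as stale too
    [ "$last" -eq 0 ] || [ $(( now - last )) -gt "$THRESHOLD" ]
}

# In practice you would loop over `wg show wg0 latest-handshakes`.
# Mocked example: last handshake 200 s ago -> stale, time to re-initiate
if is_stale 1000 1200; then
    echo "peer stale: re-resolve DynDNS endpoint and reconnect"
fi
```

The point is that only a peer which runs such a check can recover: it has to re-resolve the hostname and send fresh handshakes, which is exactly why the job must run on both ends of a dynamic site-2-site link.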

This can take a while, depending on how fast the DynDNS updates are, and may be necessary multiple times.


In your scenario, with a road warrior, the cron job on the "server" side does not help at all, because the road warrior peer has to initiate a new connection. If it fails to notice the stale connection, it will never re-initiate, and thus the DynDNS update will go unnoticed.

Luckily, for M-Net, there is a fix for that: they usually do not change the IPv6 prefix on reconnection, unlike the IPv4 address. Thus, if you use IPv6 only, or both IPv6 and IPv4, in your DynDNS, it will effectively work as if you had a fixed IP(v6). In that case, your road warrior clients will regain contact within the same WireGuard session.

Call me any time if you have questions, you have my number!
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

For Android there is "WG Tunnel", which can cope with dynamic IPs. If your resolution is to restart WG on OPNsense, though, you might have another problem, and upgrading OPNsense is strongly advised to begin with.

January 21, 2026, 09:56:50 PM #4 Last Edit: January 21, 2026, 10:02:49 PM by Dark-Sider
Thanks for your replies so far! However, it's not that simple (see below).

Quote from: Monviech (Cedrik) on January 21, 2026, 09:14:10 AM
If your connection via wireguard is inbound, it means that your devices resolve the DNS entry. On your OPNsense, I would not expect any hostname in the configuration, wireguard just binds to all interfaces (* 51820).

So I assume your clients resolve the wrong IP address and send packets to the wrong destination.

It's not a road-warrior-only setup. The wg0 instance also has one outbound connection which resolves a dynamic hostname. My clients don't have any issues with the DNS resolution of my wg server, as I can just restart the service on my clients, reboot them, or whatever. It immediately starts to work once I restart WireGuard on the server. The issue I observed most recently was that, according to the wg status page in OPNsense, it was listening on a port 33### while the tunnel configuration shows port 51820.

Quote from: meyergru on January 21, 2026, 09:41:56 AM
I wrote the stale connection job and of course, it only does half of the job...
Thanks Uwe for contributing to the project - didn't realize you were the author of this little helper :-)

Quote
In your scenario, with a road warrior, the cron job on the "server" side does not help at all, because the road warrior peer has to initiate a new connection. If he fails to notice the stale connection, it will never restart and thus the DynDNS update will go unnoticed.
Yes, I fully get that, and tbh that is not my issue, since I could just restart a road-warrior client while travelling. However, restarting the server without a "dial-in" option is not possible. More on my analysis below.

Quote
Luckily, for M-Net, there is a fix to that: They usually do not change their IPv6 prefix on reconnection, much unlike the IPv4. Thus, if you use IPv6 only or IPv6 and IPv4 in your DynDNS, it will effectively work as if you have a fixed IP(v6). In that case, your roadwarrior clients will regain contact within the same Wireguard session.
I know they say they only change the IPv6 prefix after a set amount of "offline" time. A simple restart of my OPNsense VM, lasting <60 sec, triggers a new IPv6 prefix. But that's also fine, since reconnects don't happen that often...

Quote
Call me any time if you have questions, you have my number!
Yeah we can have a chat on the weekend, I'm very busy until friday (business trip).


As mentioned above, I did a bit more tinkering and thinking last night. As you all have mentioned, the road warriors shouldn't have an issue connecting to the server, as they do their own DNS resolution. However, the problem was not that the clients couldn't resolve my server (packets were arriving, checked with a packet capture) but that nothing was sent back. As described earlier, the wg status page reported wg to be running on port 33###, which is not what is configured, and I don't know where it pulled that port from. After restarting the wg tunnel on the server everything worked fine; clients were online immediately.

So I went ahead and restarted my OPNsense VM (multiple times); the result was always the same: WireGuard worked fine for the road warriors, and the other site2site box also reconnected after some offline time, with the help of the stale connection cron job. So, problem solved? Not quite.

It still bugged me why the wg service "failed" after the reboot a couple of days ago... Then it occurred to me that I had rebooted the OPNsense VM because of issues with my DNS resolution. DHCP hands out OPNsense as the DNS server, and Unbound on OPNsense uses my local Pi-hole as an upstream host. I initially thought Unbound had failed (which it does once a year or so), but the restart did NOT fix my DNS issue this time. It actually came down to the Pi-hole needing an update and a restart.

So OPNsense was rebooted without a working DNS server reachable, as OPNsense exclusively uses my Pi-hole for all its queries.

Since my local wg0 instance also has an outbound site2site connection using a dynamic hostname as the target, it was not able to resolve that hostname after the system booted. In some other thread I read that if wg's DNS resolution fails during startup, the service might get stuck. Without having tested this further, I believe that this is what happened to me. The hung service also does not accept any inbound connections: the peers were not shown on the status page and wg0 was listening on the wrong port. When I'm back home I'll simulate this with a missing DNS server after rebooting OPNsense and see if I run into the same issue.
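A workaround sketch for this boot-time failure mode (my own idea, untested): periodically check whether wg0 came up with an empty peer list and, if so, kick the service once DNS works again. The peer counting uses standard `wg show wg0 peers` output (one public key per line); the restart command at the end is an assumption and should be verified against your OPNsense version.

```shell
#!/bin/sh
# count_peers: count non-empty lines on stdin, i.e. peers reported by
# `wg show wg0 peers`. `|| true` keeps the pipeline from failing when
# the list is empty (grep -c exits non-zero on zero matches).
count_peers() {
    grep -c . || true
}

# In practice: peers=$(wg show wg0 peers | count_peers)
# Mocked here: an empty peer list, as on my box after the failed boot
peers=$(printf '' | count_peers)

if [ "$peers" -eq 0 ]; then
    echo "wg0 has no peers - restarting WireGuard"
    # Assumed restart command on OPNsense (hypothetical, verify first):
    # configctl wireguard restart
fi
```

Run from cron every few minutes, this would have recovered my box without manual intervention, at the cost of briefly bouncing the tunnel whenever the peer list is legitimately empty.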

TL;DR
- The clients were never the issue; they can easily be restarted to "update" any outdated hostnames, but the server was not responding.
- OPNsense was rebooted without any DNS available; maybe wg didn't fully initialize, which also prevented clients from connecting.
- The stale connection script was successfully tested with my site2site connection: after running the script, the connection came back up.