Hi there,
For some time now I have noticed some strange behaviour:
On every reboot WireGuard does not start up correctly - even though the log claims it does. None of the tunnels are working. 100% reproducible.
I then have to deactivate and re-activate WireGuard once -> it works stably until the next reboot.
The WireGuard log does not give any clue; everything looks usual.
Any hints?
Are you using WG to establish outbound tunnels from OPNsense, or is OPNsense providing WG for other systems to "dial in"?
If the former, are you using host names (FQDNs) for the peers? Can you use IP addresses instead?
It is configured both ways. Whichever side is first establishes the tunnel.
Using the IP for the clients is not an option due to dynamic IPs.
I have some WG tunnels on OPNsense, only one install; for some weeks now, one tunnel does not come back after a reboot.
Mostly I have to obtain a fresh WAN IP to make this tunnel come back. Yesterday this tunnel went down at 14:08 without any obvious reason. The only way to make it re-connect was to obtain a fresh IP for WAN.
Makes no sense at all. Devs will say: wrong config, will break some day. But it worked fine for years; I have other tunnels configured the same way, rock-solid and coming back after each and every reboot.
Did some more testing:
changing from FQDN to IP does not change anything.
But I made one additional observation:
After reboot, I get log entries like
/usr/local/opnsense/scripts/wireguard/wg-service-control.php: The command </usr/bin/wg syncconf 'wg3' '/usr/local/etc/wireguard/wg3.conf'> returned exit code 1 and the output was "Name does not resolve: `xx.yyy.de:51820' Configuration parsing error"
I would be fine with a comment like "wrong config" if someone could tell me what is wrong all of a sudden and how to correct it ...
Can you confirm if for you it is the peer or the gateway that is marked offline?
> Devs will say: Wrong config, will break some day.
No, but what we're saying is:
> "Name does not resolve: `xx.yyy.de:51820'
That's a DNS error. No dev can reasonably resolve xx.yyy.de for you.
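For what it's worth, the failing step can be reproduced outside of wg with a plain resolver call. This is only an illustration of the error class, not wg's actual code; `host.invalid` is a reserved name that never resolves, standing in for a DynDNS name during early boot:

```python
import socket

def resolve_endpoint(host, port):
    """Illustration: resolving a peer Endpoint is just a getaddrinfo()
    lookup; if that fails, the tunnel config cannot be applied."""
    try:
        return socket.getaddrinfo(host, port, proto=socket.IPPROTO_UDP)[0][4][0]
    except socket.gaierror as exc:
        raise RuntimeError(f"Name does not resolve: `{host}:{port}'") from exc
```

If the resolver isn't reachable yet at the moment this runs (e.g. right after boot), you get exactly this kind of failure regardless of how correct the config is.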
Cheers,
Franco
OK, x tunnels, all WG with DynDNS, of which x-1 come up normally after an OPNsense reboot, while 1 tunnel doesn't. Restarting the WG instance or dis-/enabling it does nothing to bring the tunnel up, nor does any operation you can imagine on the remote WG instance.
But obtaining a fresh WAN IP on opnsense brings up this tunnel.
All the tunnels have been up and running for 5+ years; the problem started some weeks ago.
What to make of this?
As written somewhere else: this very same tunnel went down at a random time of day recently, and again only a fresh WAN IP could recover the tunnel. No other operation helped.
You can try the support offering we have which fits this problem scope. Assuming everyone suffers from a low-quality integration/implementation isn't helpful. It's also not how this integration/implementation is going to improve without describing the actual problem in a way that makes sense with the code at hand.
Cheers,
Franco
I started seeing the exact same issue as you, and it started happening on all my routers. Some regression was introduced in the previous release (26.1.6) or the one before that (26.1.5). I don't know which one it was, because 26.1.5 didn't require a reboot, so that one could have been the culprit; I can't tell because I didn't reboot then. I only had to reboot after applying 26.1.6, and it started happening on all my routers after that reboot.
https://forum.opnsense.org/index.php?topic=51578.msg
Other people have reported it as well:
https://forum.opnsense.org/index.php?topic=50748.msg
https://forum.opnsense.org/index.php?topic=32232.msg
And if you do a search, there are many, many more older posts with the same problem.
I'm not sure what changed in the last month or two, but I had never changed my configuration or had this issue until I applied the two updates and rebooted.
So the issue is either a regression in 26.1.5 or 26.1.6.
All I can see from these reports is that DNS resolution isn't working when the tunnel is supposed to come up (if the message is actually the error to look into and not a cosmetic oddity, as it could be restarted later anyway). A reboot is also the most likely point where DNS resolution isn't up and running yet or is, worst case, routed through a tunnel that isn't up yet.
Cheers,
Franco
Quote from: franco on April 21, 2026, 03:36:19 PM: All I can see from these reports is that DNS resolution isn't working when the tunnel is supposed to come up (if the message is actually the error to look into and not a cosmetic oddity as it could be restarted later anyway). A reboot is also the most likely point DNS resolution isn't up and running yet or is worst case routed through a tunnel that isn't up yet.
Cheers,
Franco
I agree that is definitely the issue here. I just don't know what changed in the last two versions so that this is now a factor when it wasn't before. I never had the issue, and I've had the same configuration since the 24.x series.
I would love to bisect this issue if there's a way to go change-by-change on the router.
Here's the thing about bisecting this... we don't know which DNS server the reporters used, how they configured it, and which routes they take.
This isn't about a change in e.g. Unbound targeting WireGuard tunnel resolution. And very little changes in our DNS startup sequence. WireGuard starts intentionally late.
But by all means try to bisect this using the core package and opnsense-revert.
Cheers,
Franco
I also suddenly started experiencing this issue when I upgraded from 25.7.11 to 26.1.3. I am using FQDNs (not IPs) in the WireGuard site-to-site config which was affected.
I knew that using FQDNs was risky and not recommended, as DNS resolution might not be working when WireGuard tries to start the tunnel. For some reason it worked for a long time (maybe I was lucky). So I was lazy and just took the risk until the problem emerged.
As I started experiencing the issues with 26.1.3, I implemented an additional post-reboot script to restart WireGuard after DNS resolution is working (basically a 60 second delay after startup - this gives Unbound time to start and get going). For my site-to-site use case this is more than sufficient. It has been working without any issues so far. There are many tutorials on the web explaining how to do this.
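The delayed-restart idea can be made a bit smarter than a fixed sleep: poll until the peer's name actually resolves, then restart the service once. A rough sketch only - the `configctl wireguard restart` action name is an assumption and may differ on your install:

```python
import socket
import subprocess
import time

def wait_for_dns(host, timeout=120, interval=5):
    """Poll until `host` resolves, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            socket.getaddrinfo(host, None)
            return True
        except socket.gaierror:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

def restart_wireguard_when_ready(peer_host):
    # NOTE: the configd action name below is assumed; adjust it to
    # whatever your OPNsense install actually exposes.
    if wait_for_dns(peer_host):
        subprocess.run(["configctl", "wireguard", "restart"], check=False)
```

Compared to a plain `sleep 60`, this comes up as soon as Unbound is answering and still gives up cleanly if DNS never becomes available.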
There's also a DNS renew cron job for WireGuard to use... "Renew DNS for WireGuard on stale connections".
It's even more ironic than you might realise, because it is a Python rewrite of
https://github.com/WireGuard/wireguard-tools/tree/master/contrib/reresolve-dns
"Run this script from cron every thirty seconds or so, and it will ensure
that if, when using a dynamic DNS service, the DNS entry for a host
changes, the kernel will get the update to the DNS entry."
So the author of WireGuard recommends attempting to renew your DNS every 30 seconds.
You know why?
Because the kernel doesn't do DNS. When you configure WireGuard (or any other tunnel using ifconfig) with an FQDN/hostname, the tool tries to resolve it once, and only the IP address is ever stored in the kernel, which makes it impossible for the kernel to renew the DNS.
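The loop both scripts boil down to: re-resolve each peer's configured endpoint hostname and push the fresh address back into the kernel with `wg set`. A rough sketch under stated assumptions - the `peer_endpoints` argument is made up for illustration; the real scripts recover hostnames from the saved configuration:

```python
import socket
import subprocess

def reresolve_peers(ifname, peer_endpoints):
    """Re-resolve each peer's endpoint hostname and update the kernel
    via `wg set`. `peer_endpoints` maps a peer public key to its
    configured (hostname, port)."""
    for pubkey, (host, port) in peer_endpoints.items():
        try:
            addr = socket.getaddrinfo(host, port, proto=socket.IPPROTO_UDP)[0][4][0]
        except socket.gaierror:
            continue  # still unresolvable; try again on the next cron run
        subprocess.run(
            ["wg", "set", ifname, "peer", pubkey, "endpoint", f"{addr}:{port}"],
            check=True,
        )
```

Note that this only papers over the kernel limitation from cron; it cannot help if the name never resolves in the first place.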
Cheers,
Franco
Quote from: franco on April 21, 2026, 09:05:09 PM: There's also a DNS renew cron job for WireGuard to use... "Renew DNS for WireGuard on stale connections".
It's even more ironic than you might realise, because it is a Python rewrite of
https://github.com/WireGuard/wireguard-tools/tree/master/contrib/reresolve-dns
"Run this script from cron every thirty seconds or so, and it will ensure
that if, when using a dynamic DNS service, the DNS entry for a hosts
changes, the kernel will get the update to the DNS entry."
So the author of WireGuard recommends attempting to renew your DNS every 30 seconds.
You know why?
Because Kernel doesn't do DNS and all you do when configuring WireGuard (or any other tunnel using ifconfig) using a FQDN/hostname the tool will try to resolve it once and only the IP address is ever stored in the kernel, which makes it impossible for the kernel to renew the DNS.
Cheers,
Franco
In my case, this cron job has been enabled for a long time - and it was already enabled when I observed this change in behavior after upgrading from 25.7.11 to 26.1.3.
So, on 25.7.11:
"Renew DNS for WireGuard on stale connections" --> was enabled and run periodically
No issues when restarting firewall.
On 26.1.3:
"Renew DNS for WireGuard on stale connections" --> was enabled and run periodically
WireGuard did not start on reboot due to DNS resolution issues
--> delayed restart after reboot (60 seconds) --> WireGuard starts as expected.
I was happy enough to get it up and running with the delayed restart after reboot, so I didn't really investigate this further. However, I noticed the change in behavior even though "Renew DNS for WireGuard on stale connections" has been enabled all the time.
Ok, fair - so what happens when /usr/local/opnsense/scripts/wireguard/reresolve-dns.py is manually invoked without the workaround in place?
Cheers,
Franco
Quote from: franco on April 21, 2026, 09:44:17 PM: Ok fair so what happens when /usr/local/opnsense/scripts/wireguard/reresolve-dns.py is manually invoked without the workaround in place?
Cheers,
Franco
I will put this on my TODO list for the next maintenance window. All systems are running now and everything is OK, but I will investigate this with my test setup when preparing for the next round of updates. I will test before applying any updates, just to confirm that we have a clear reference point.
Quote from: franco on April 21, 2026, 09:44:17 PM: Ok fair so what happens when /usr/local/opnsense/scripts/wireguard/reresolve-dns.py is manually invoked without the workaround in place?
Cheers,
Franco
OK, I tested this now on OPNsense 26.1.6-amd64.
WireGuard site-to-site setup:
Router A has router B configured as a peer; the endpoint address is an FQDN.
Router B has router A configured as a peer; the endpoint address is an FQDN.
"Renew DNS for WireGuard on stale connections" --> is enabled and run periodically
--> cron entry: enabled * * * * * "Renew DNS for WireGuard on stale connections"
Step 1: disabled the workaround on both site-to-site routers (A and B)
--> the workaround was the delayed restart (60 second delay) of WireGuard after system bootup to ensure that DNS is working when the tunnel is restarted.
--> site-to-site tunnel is working at this point
Step 2: restart router A
--> wireguard tunnel starts properly as router A boots up
--> note that router B has not been restarted
Step 3: restart router B
--> the WireGuard site-to-site tunnel won't start on router B due to DNS resolution issues (Name does not resolve).
--> site-to-site tunnel is down
Step 4: manually invoke /usr/local/opnsense/scripts/wireguard/reresolve-dns.py on router B
--> tunnel is still down
--> no feedback from CLI and no log entry in Wireguard log
Step 5: manually invoke /usr/local/opnsense/scripts/wireguard/reresolve-dns.py on router A
--> tunnel is still down
--> no feedback from CLI and no log entry in Wireguard log
Step 6: restart Wireguard S2S on router A
--> tunnel is still down
Step 7: restart Wireguard S2S on router B
--> tunnel comes up and traffic flows
Any ideas how to debug this further?
Quote from: hfvk on April 30, 2026, 07:01:25 PM: OK, tested this now on OPNsense 26.1.6-amd64 [...] Any ideas how to debug this further?
I just tested this on OPNsense 26.1.7-amd64 as well. The behavior is exactly the same. The only way to get the tunnel online after a router B restart is to either
1) set up a script to perform a delayed tunnel restart on router B after startup, or
2) manually restart the tunnel on router B.
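While reproducing these steps, it can help to watch exactly when the tunnel goes stale instead of just observing "down". `wg show <if> latest-handshakes` prints one `<pubkey>\t<unix timestamp>` line per peer (0 = never); a small parser, run in a loop during the reboot tests, pins down the moment handshakes stop. A sketch, assuming interface name `wg0`:

```python
import subprocess
import time

def handshake_ages(ifname="wg0"):
    """Return seconds since the last handshake per peer public key
    (None = never handshaked), parsed from `wg show ... latest-handshakes`."""
    out = subprocess.run(
        ["wg", "show", ifname, "latest-handshakes"],
        capture_output=True, text=True, check=True,
    ).stdout
    now = int(time.time())
    ages = {}
    for line in out.splitlines():
        pubkey, ts = line.split()
        ages[pubkey] = None if ts == "0" else now - int(ts)
    return ages
```

A peer whose age keeps growing past ~180 seconds is exactly what the "Renew DNS for WireGuard on stale connections" job is supposed to catch, so comparing its output against these ages during a reboot would narrow down where the chain breaks.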