OPNsense Forum

Archive => 21.1 Legacy Series => Topic started by: juere on April 05, 2021, 02:04:35 pm

Title: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 05, 2021, 02:04:35 pm
We are running several larger production installations with a virtual OPNSense "central" gateway (cloud based, high speed internet connection >1Gbit/s) and "branch offices" (DSL based provider internet, typically 100/40Mbit/s) connected via wireguard.
The branch offices are using either OPNSense or OpenWRT hardware routers.
Wireguard uses IPv6 for the tunnel and Dual-Stack inside the tunnel.

I switched two of these installations to the new, experimental wireguard-kmod driver that comes with 21.1.4, wherever OPNsense was used.
This led to a substantial increase in downstream rate (from central gateway to branch office, typically by a factor of 1.5 - 2.5) and, as expected, a slight increase in upstream rate.
The tunnels are running fine without major issues.

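The only problem I can see so far:
After the central gateway is rebooted, the OPNsense-based branch offices do not re-connect until wireguard is restarted on them.
This worked fine before, when all the OPNsense gateways used the wireguard-go implementation.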
The branch offices using OpenWRT (kernel based wireguard on linux) are not affected and re-connect just fine.

Might be a bug in the wireguard-kmod driver :)
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: mimugmail on April 05, 2021, 06:22:34 pm
But Central is still -go?
What happens when setting short keepalives like 5sec?
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 06, 2021, 07:42:26 am
No, central is also wireguard-kmod.
Otherwise there was no significant speed improvement; I tested this as well.

Keepalive is set to 25s on the branch-office routers.
Since 25s, like your suggested 5s, is shorter than the reboot time of the central gateway, I suspect this will make no difference. But I will test this tonight and tell you what happens.
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 06, 2021, 07:37:27 pm
Hmm ...

Set keepalive to 5s for *one* endpoint and rebooted the central gateway.
All 4 branch offices (OPNSense based) came back and reconnected nicely.

Set keepalive back to 25s and tried again.
Same result.

Last Thursday I tried the reboot (with 25s keepalive) three times, and each time only the OpenWRT-based branch offices reconnected. On the OPNsense-based branch offices I had to restart wireguard :(

I'm a bit clueless now, looks like some strange race condition ...
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: mimugmail on April 06, 2021, 09:02:48 pm
A race would be reproducible :)
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: SFC on April 06, 2021, 11:04:51 pm
Can you set up a FreeBSD 13 VM at one of the remotes and see if you get the same results?  The fix will likely need to come from upstream.

Anything in the logs?
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 07, 2021, 07:55:37 am
Quote from: mimugmail on April 06, 2021, 09:02:48 pm
A race would be reproducible :)

Well, I did three tests each time, with the outcome as described.
Yesterday I did shutdown / start cycles of the central OPNsense VM.
Last Thursday I did reboots.
Might this make any difference?

Quote from: SFC on April 06, 2021, 11:04:51 pm
Can you set up a FreeBSD 13 VM at one of the remotes and see if you get the same results?  The fix will likely need to come from upstream. Anything in the logs?

I saw nothing relevant in the logs. There actually is one branch office with a VMWare Server.

Tonight I will try to test whether rebooting really makes a difference compared to shutdown/start.
If not, I will set up the FreeBSD 13 VM on Saturday.
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 14, 2021, 10:15:04 am
Did some more testing:

Whether I restart or power cycle the "central" OPNsense does not seem to make a real difference.
I typically ended up with some or all "branch" gateways not reconnecting via wireguard unless wireguard is restarted on the branch gateway.
Rarely (2 of 10 attempts) they all came up again.
The OpenWRT based branch routers (3 of 7) always reconnected.

I think I will work around this with a cron script that checks the wireguard tunnel on the branch gateways and restarts it if necessary :)
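
A watchdog along those lines might look like the following minimal sketch. It is not something posted in this thread. It assumes that `wg show <interface> latest-handshakes` prints one `public-key<TAB>unix-timestamp` line per peer (with 0 meaning no handshake yet), that a handshake older than about 180 seconds indicates a dead tunnel when keepalives are enabled, and that the caller supplies the restart command. The function names, the threshold and the restart script path are hypothetical.

```python
import subprocess
import time

# Assumption: with keepalives enabled, peers re-handshake roughly every two
# minutes, so three minutes without a handshake is treated as "tunnel down".
STALE_AFTER = 180


def parse_latest_handshakes(output):
    """Parse `wg show <iface> latest-handshakes` output into {public_key: epoch_seconds}."""
    peers = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, stamp = parts
        peers[key] = int(stamp)
    return peers


def stale_peers(peers, now, max_age=STALE_AFTER):
    """Return the sorted keys of peers that never completed a handshake (timestamp 0)
    or whose last handshake is older than max_age seconds."""
    return sorted(k for k, ts in peers.items() if ts == 0 or now - ts > max_age)


def check_and_restart(iface, restart_cmd, max_age=STALE_AFTER):
    """Query the interface once and run restart_cmd (an argument list) if any peer is stale."""
    result = subprocess.run(
        ["wg", "show", iface, "latest-handshakes"],
        capture_output=True, text=True, check=True,
    )
    stale = stale_peers(parse_latest_handshakes(result.stdout), time.time(), max_age)
    if stale:
        # restart_cmd is supplied by the caller; how wireguard is restarted
        # on a given system is deliberately left open here.
        subprocess.run(restart_cmd, check=True)
    return stale

# Hypothetical use from a script that cron runs every minute or so:
#   check_and_restart("wg0", ["/path/to/restart-wireguard.sh"])
```

Keeping the parsing and the staleness decision in separate functions means they can be checked against captured `wg` output without touching a live tunnel.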
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: SFC on April 14, 2021, 02:53:01 pm
Two things: did you ever get a chance to test with vanilla FreeBSD?  I think your issue may have been fixed in a patch that was just merged to fix a keepalive timeout.

https://lists.zx2c4.com/pipermail/wireguard/2021-April/006612.html
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: juere on April 14, 2021, 03:18:36 pm
I could install a virtualized FreeBSD in parallel to the "central" OPNsense, but not in the branch offices, as there are currently no virtualization hosts there. Do you think this makes sense, or is a setup with FreeBSD on both sides necessary?

Can anyone say whether the new 0.0.20210412 wireguard-kmod package will be in OPNsense 21.1.5?
It would be much easier to wait and test then :)
Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: SFC on April 14, 2021, 03:38:57 pm
I doubt the WAN link is the issue (although it is possible). Just set up the FreeBSD host in the main office, treat it like a branch, and connect it back to the main instance. Then see whether it reconnects when you bounce the main instance or whether it acts like the branch offices.

Title: Re: Testing wireguard-kmod on 21.1.4 / problems with re-connecting
Post by: franco on April 14, 2021, 07:32:37 pm
I could be wrong, but I did not see thread-relevant commits to wireguard-kmod update landing in 21.1.5.


Cheers,
Franco