OPNsense Forum

Archive => 24.1, 24.4 Legacy Series => Topic started by: Koloa on April 05, 2024, 12:24:55 AM

Title: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: Koloa on April 05, 2024, 12:24:55 AM
I will do a poor job of explaining this, and my apologies for that in advance.

I was at the most recent version of OPNSense, and everything was working fine.

I updated to 24.1.5_1 this morning, and one of my Wireguard tunnels - whilst up - is not routing/passing packets as it did prior to this version being installed.

The interface (wg3) receives packets from the peer (I can remotely access the peer, and send ICMP or other traffic to the IP address of the wg3 endpoint on my OPNSense box).

However, no outbound packets traverse that interface, despite there having been no changes to my configuration for Wireguard for quite some time.

I have firewall rules in my OPNSense to permit certain hosts on my LAN to send to the remote peer, and am using outbound masquerading to masquerade as the "Interface address" for those packets -- again, nothing has changed here, and it was routing/passing packets just fine until this update.

I've tried a few reboots, just in case that might "clear something up", but to no avail.

On my OPNSense box, listening on the wg3 interface, I can see my remote peer's ICMP coming in:

listening on wg3, link-type NULL (BSD loopback), capture size 262144 bytes
09:18:16.311527 IP 10.200.202.2 > 10.200.202.1: ICMP echo request, id 49148, seq 256, length 64
09:18:17.335410 IP 10.200.202.2 > 10.200.202.1: ICMP echo request, id 49148, seq 257, length 64
09:18:18.359476 IP 10.200.202.2 > 10.200.202.1: ICMP echo request, id 49148, seq 258, length 64
09:18:19.383596 IP 10.200.202.2 > 10.200.202.1: ICMP echo request, id 49148, seq 259, length 64


But nothing is returned from 10.200.202.1, which is the wg3 interface.

WireGuard reports that it is up and handshakes are working fine, which is obviously the case or the ICMP wouldn't make it in.

I have several other WG interfaces, all of which are working fine as far as I can tell (Dashboard says all gateways are up, and packets seem to be flowing).

The only thing that is different about this particular interface (wg3) is that it's only meant to be listening on localhost, which means that the traffic to the remote endpoint needs to go to another process on my OPNSense box which is also listening on localhost.

Again, this was all working great till I updated, now not so much.

If I can't figure out a quick fix for this one, what's the right/best way to "rollback" OPNSense installs to the previous version whilst I ponder this a bit more?

Thank you!
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: Koloa on April 05, 2024, 12:31:48 AM
I did a manual "restart" of the wg3 interface as per:

https://forum.opnsense.org/index.php?topic=39819.0

And that seems to have fixed whatever was "stuck".

And yes, I'm at _2 now.
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: stoege on April 05, 2024, 08:34:08 AM
I've got the same strange behaviour running multiple WG clients against an OPNsense cluster. Since upgrading to 24.1.5 yesterday, some clients can't connect anymore.

What still seems to work:
- connecting from the Office with MacOS
- connecting from the Office with OpenBSD
- connecting from Remote with Debian

What doesn't work anymore:
- connecting from Home via another OPNsense
- connecting from Home with MacOS
- connecting from Remote with iPhone

I saw there is a patch available this morning, 24.1.5_2. It did not help for me.

*** Update ***
Upgrade to OPNsense 24.1.5_3 and fixing a local routing issue solved all my problems.
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: franco on April 05, 2024, 09:32:50 AM
# opnsense-revert -r 24.1.4 opnsense

I don't think it's the wireguard code...
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: os914964619 on April 05, 2024, 02:54:04 PM
Seeing the same issue here. WG site to site worked fine until I upgraded.

# opnsense-revert -r 24.1.4 opnsense

does not fix it.

I noticed that the OS routing table is not showing any of the allowed IPs I have under the Site to Site peer. I can successfully ping the remote "Wireguard transfer net IP" when I SSH into the router but that's about it.

"Disable routes" is Unchecked and I've rebooted multiple times.

I looked at `/usr/local/etc/wireguard/*.conf` and everything looks good.

Performing a traceroute on the router, I can see the issue: because the routes from the "Allowed IPs" are not getting propagated to the OS, the packets are exiting via the WAN / default gateway interface instead of going to the wireguard interface.

Anything under Wireguard->Peers->Allowed IPs should be showing up when I do a `netstat -rn` on the router, correct? I think there was a regression introduced and that's the bug.
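
To make that comparison easier, here is a minimal sketch that lists the AllowedIPs entries from a generated config, one per line, so they can be checked by eye against `netstat -rn`. The function name `allowed_ips` is made up for illustration, and the sketch assumes the config has the same layout as the files under `/usr/local/etc/wireguard/`.

```shell
#!/bin/sh
# allowed_ips CONF  (hypothetical helper, not part of OPNsense)
# Print every AllowedIPs entry found in a wg-style config, one per line.
allowed_ips() {
    awk -F= '
        # Only real "AllowedIPs = a,b,c" lines; commented-out ones start with "#".
        /^[ \t]*AllowedIPs[ \t]*=/ {
            n = split($2, entries, ",")
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", entries[i])      # trim spaces around each entry
                if (entries[i] != "") print entries[i]
            }
        }
    ' "$1"
}

# Possible usage on the router (paths as seen in this thread):
#   allowed_ips /usr/local/etc/wireguard/wg1.conf
#   netstat -rn | grep wg1
```

Any entry in the first list that is not covered by a destination in the second list would be a candidate for a missing route.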
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: MrTux on April 05, 2024, 03:31:02 PM
Hi

I've also had issues with WireGuard since I upgraded to 24.1.5x. What I see is that, with OPNsense version 24.1.5x, WireGuard doesn't create the routes for the WireGuard peers.
The "Allowed IPs" of a WireGuard peer should be added to the routing table, but this isn't the case anymore.
I added the routes manually and then it works again, but this is only a workaround.
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: os914964619 on April 05, 2024, 03:39:10 PM
Routes are no longer being propagated in Wireguard since the update. Statically adding the routes is a workaround, but there is a bug in opnsense:

Configuration:

root@router:~ # cat /usr/local/etc/wireguard/wg1.conf
####################################################
# Interface settings, not used by `wg`             #
# Only used for reference and detection of changes #
# in the configuration                             #
####################################################
# Address =  192.168.10.2/24
# DNS =
# MTU =
# disableroutes = 0
# gateway =

[Interface]
PrivateKey = ...
ListenPort = ...

[Peer]
# friendly_name = ...
PublicKey = ...
PresharedKey = ...
Endpoint = ...:...
AllowedIPs = 192.168.10.1/32,192.168.20.1/24


I can successfully ping the remote "Wireguard transfer net IP"

root@router:~ # ping 192.168.10.1
PING 192.168.10.1 (192.168.10.1): 56 data bytes
64 bytes from 192.168.10.1: icmp_seq=0 ttl=64 time=60.482 ms
64 bytes from 192.168.10.1: icmp_seq=1 ttl=64 time=28.849 ms
^C
--- 192.168.10.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 28.849/44.666/60.482/15.817 ms


I can't ping the remote network

root@router:~ # ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1): 56 data bytes
^C
--- 192.168.20.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss


The remote network is not showing up in the routing table even though it's in the "Allowed IPs"

root@router:~ # netstat -rn|grep wg1
192.168.10.0/24    link#15            U           wg1


Manually adding the remote network as a static route to the wireguard interface

root@router:~ # route add -net 192.168.20.1/24 -interface wg1
add net 192.168.20.1: gateway wg1


Confirming it's there

root@router:~ # netstat -rn|grep wg1
192.168.20.0/24    link#15            US          wg1
192.168.10.0/24    link#15            U           wg1


Now I can ping the remote network as it's getting routed through the wireguard interface

root@router:~ # ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1): 56 data bytes
64 bytes from 192.168.20.1: icmp_seq=0 ttl=64 time=29.310 ms
64 bytes from 192.168.20.1: icmp_seq=1 ttl=64 time=30.184 ms
^C
--- 192.168.20.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 29.310/29.747/30.184/0.437 ms


I would love a better workaround for this until it's fixed as I don't want to be manually adding static routes.
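
As a stopgap, something along these lines could at least take the typing out of it. This is only a sketch: `route_cmds` is a made-up name, it reads prefixes one per line on stdin, and it only prints `route add` commands in the same form as the one used above without running anything, so the output can be reviewed first. IPv6 entries are skipped.

```shell
#!/bin/sh
# route_cmds IFACE  (hypothetical helper, dry run only)
# Read prefixes on stdin and print one `route add` command per IPv4 entry.
route_cmds() {
    iface=$1
    while IFS= read -r prefix; do
        case $prefix in
            ''|*:*) continue ;;   # skip blank lines and IPv6 entries
        esac
        # Same syntax as the manual command shown in this post.
        printf 'route add -net %s -interface %s\n' "$prefix" "$iface"
    done
}

# Possible usage:
#   printf '192.168.20.1/24\n' | route_cmds wg1
```

Routes added this way are not persistent and are no substitute for finding out why they are not being created in the first place.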
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: franco on April 05, 2024, 05:33:59 PM
Quote from: os914964619 on April 05, 2024, 02:54:04 PM
> Seeing the same issue here. WG site to site worked fine until I upgraded.
>
> # opnsense-revert -r 24.1.4 opnsense
>
> does not fix it.

I did not say it would. ;)

It's a bit odd, because the route creation and check is pretty straightforward, but debugging these recent cases has been anything but. That can also point to configuration issues that may have existed before the update.


Cheers,
Franco
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: os914964619 on April 05, 2024, 05:54:49 PM
I suspect it's related to commits on:

https://github.com/opnsense/core/issues/7304
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: franco on April 05, 2024, 06:04:54 PM
As I said, that's a straightforward change and easily verified with a routing table and your configuration.


Cheers,
Franco
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: os914964619 on April 05, 2024, 07:46:03 PM
I figured it out. You were right franco, thanks. It was an invalid configuration that was allowed through the opnsense validator at some point in the past.

For people that have this issue: Check if you've assigned a static ip address to your wireguard interface. You would be able to see this under Interface->[Your wireguard interface].

If you go to this page and press save without making ANY changes, opnsense will yell at you with an error message. Make the fix (in my case, don't assign a static ip address), then press save, apply the changes, and then restart wireguard. The routes will now get propagated.

I feel like the page should not have had "Static IPv4" as a drop down option if it's a wireguard interface.

Is there anything in opnsense that could find invalid configuration entries like that in the future? I did a health check previous to this and that said everything looked good.
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: franco on April 05, 2024, 09:54:19 PM
Happy to hear you found it.

> I feel like the page should not have had "Static IPv4" as a drop down option if it's a wireguard interface.

That's true yet for historic reasons that's what it was capable of and how the page is laid out at the moment. We will change these behaviours when the interface settings page goes to MVC/API, but that will be one of the very last changes in that migration I think.

> Is there anything in opnsense that could find invalid configuration entries like that in the future? I did a health check previous to this and that said everything looked good.

Health check is mostly for file system integrity. And yes, the MVC framework can validate this, if it's actually migrated. See previous point. :)


Cheers,
Franco
Title: Re: 24.1.5: Wiregard routing/masquerading issue? How to rollback?
Post by: Koloa on April 06, 2024, 04:10:55 AM
Quote from: os914964619 on April 05, 2024, 07:46:03 PM
> For people that have this issue: Check if you've assigned a static ip address to your wireguard interface. You would be able to see this under Interface->[Your wireguard interface].
>
> If you go to this page and press save without making ANY changes, opnsense will yell at you with an error message. Make the fix (in my case, don't assign a static ip address), then press save, apply the changes, and then restart wireguard. The routes will now get propagated.

In my case, this is not the issue (at least not obviously); routes are not being propagated, as you pointed out, but it's not due to an issue with the interface configuration.

My interface is set to "None" for the IPv4 and v6 configuration type -- all of my WG interfaces are -- and doing a "save" does not generate an error.

However, the UI *does* say that a change was made and now I need to apply it, even though no change was actually made, which may imply something else changed.

Certainly possible that it's something related to the Github issue mentioned. My list of AllowedIPs is quite extensive, 168 defined networks or /32 hosts. I'll go through that carefully and look for any overlaps - I did find a /24 which was also defined in a /16.
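
With that many entries, the overlap hunt can be scripted rather than done by eye. A rough sketch, IPv4 only: it reads one prefix per line on stdin and reports every prefix that sits entirely inside another one. The name `find_overlaps` is invented for this example, and whether an overlap actually matters for route creation is not something the script can tell.

```shell
#!/bin/sh
# find_overlaps  (hypothetical helper)
# Read IPv4 prefixes on stdin and print each one that lies inside another.
find_overlaps() {
    awk '
        # Turn a dotted quad into a single number.
        function to_int(ip,    o) {
            split(ip, o, ".")
            return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
        }
        /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+(\/[0-9]+)?$/ {
            split($0, p, "/")
            len = (p[2] == "") ? 32 : p[2] + 0   # a bare address counts as /32
            size = 2 ^ (32 - len)
            ip = to_int(p[1])
            n++
            name[n]  = $0
            first[n] = ip - (ip % size)          # first address in the prefix
            last[n]  = first[n] + size - 1       # last address in the prefix
            plen[n]  = len
        }
        END {
            for (i = 1; i <= n; i++)
                for (j = 1; j <= n; j++)
                    if (i != j && first[j] <= first[i] && last[i] <= last[j] &&
                        (plen[i] != plen[j] || i > j))   # report exact duplicates once
                        print name[i] " is inside " name[j]
        }
    '
}

# Possible usage:
#   printf '10.1.2.0/24\n10.1.0.0/16\n' | find_overlaps
```

The comparison is quadratic in the number of entries, which is fine for a list of a couple of hundred prefixes.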

In my situation, though, manually restarting the interface in question from the UI allowed me to route traffic again, but it's still not clear what's "broken".