Traffic routed arbitrarily over the WireGuard interface despite disabled WG gw

Started by wrobelda, February 23, 2022, 05:37:59 PM

(Note that this was originally reported on GitHub, as I suspect this is a bug: https://github.com/opnsense/core/issues/5592)

Describe the bug

I am migrating my setup from pfSense to OPNSense. Everything was OK so far, until Wireguard client VPN migration. I copied my config 1:1 from pfSense, which was a basic "client" connection to a remote VPN provider, accompanied by a selective traffic redirection for one of the LAN hosts. Used this guide and it worked from scratch: https://docs.netgate.com/pfsense/en/latest/recipes/wireguard-client.html

Now, with OPNSense, here's what's happening:

1. With my LAN host redirection rule enabled, the host cannot reach the Internet. I checked the `wg0` interface on the firewall and saw only the monitoring ICMP packets.
2. After a reboot, somehow *all* the WAN traffic is routed via `wg0`, although LAN hosts don't get the Internet (probably because NAT is somehow messed up)
3. Since all WAN traffic is being routed, I naturally assumed that the WG gw must have taken over the WAN one. However:
- WG gw has a lower priority (255) vs the WAN gateway (254)
- WG gw is not marked as upstream, but WAN gw is
4. Still, I disabled the WG gw altogether, yet the traffic *still* shows on the wg0 interface (!), and the LAN hosts are still not connecting to the Internet
5. So I rebooted, and it's still the same (!!!). Gateway is down, all custom firewall rules disabled, yet the WAN traffic still shows on `wg0` :o
6. Only after disabling wg0 interface things actually got back to normal. Rebooting and re-enabling the interface doesn't bork it up, which is a clear indicator that enabling/disabling things (gateways?) is currently not deterministic.

FYI, I can reproduce this each time.

Expected behavior

– Should be able to selectively route the traffic over the Wireguard gateway.
– Gateway should respect the priority and upstream denotation.
– Disabling a gateway should revert back to the next one in priority
– Disabling a gateway should clean up (revert) all changes done to networking configuration
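To make the expectation above concrete, here is a minimal Python sketch of the selection order I would expect. It is not OPNsense's actual code: the `Gateway` class, the `pick_default` function and the gateway names `WAN_GW` and `WG_GW` are made up for illustration, and it assumes that upstream gateways are considered first and that a lower priority number wins (254 beats 255, as in the report).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gateway:
    name: str           # hypothetical gateway name
    priority: int       # assumption: lower number = preferred
    upstream: bool      # marked as upstream in the gateway settings
    enabled: bool = True


def pick_default(gateways: List[Gateway]) -> Optional[Gateway]:
    """Return the expected default gateway, or None if none is enabled."""
    candidates = [g for g in gateways if g.enabled]
    if not candidates:
        return None
    # Upstream gateways sort first, then the lowest priority number.
    return min(candidates, key=lambda g: (not g.upstream, g.priority))


gateways = [
    Gateway("WAN_GW", priority=254, upstream=True),
    Gateway("WG_GW", priority=255, upstream=False),
]

# With both gateways enabled, the WAN gateway is expected to win.
print(pick_default(gateways).name)

# With the WG gateway disabled, nothing should change for WAN traffic.
wg_disabled = [gateways[0], Gateway("WG_GW", 255, False, enabled=False)]
print(pick_default(wg_disabled).name)
```

Under these assumptions both calls select `WAN_GW`, which is why WAN traffic showing up on `wg0` was unexpected.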

Environment
OPNsense 22.1.1_3-amd64


I checked my netflow and found similar behavior. For context: my WireGuard server is running but there are no clients connected.
After clicking apply in the wg0 Interface configuration and restarting wireguard things went back to normal.

I don't even have a WireGuard gateway configured.

OPNsense 22.1.1_3-amd64
FreeBSD 13.0-STABLE
OpenSSL 1.1.1m 14 Dec 2021

(Image attached)


Quote from: seed on February 26, 2022, 03:51:41 PM
I checked my netflow and found similar behavior. For context: my WireGuard server is running but there are no clients connected.

Mind reporting back on GitHub? The link is at the beginning of my first message. This should be handled as a bug, but is currently labeled as a "support" task.

I just upgraded to 22.1.2 and my LAN hosts lost Internet connectivity. I rebooted once again and noticed connectivity was there for a while before dropping within seconds, so I suspected this issue again, and, bingo: despite the WG interface being explicitly disabled in the UI, I can see in the ifconfig output that the wg0 interface is up and all the WAN traffic is routed via it. Enabling it in the UI and disabling it again restored the Internet to LAN hosts.

This is ridiculous.

Can you post screenshots of assigned WG interface, wireguard config (incl. Advanced), Gateways overview and your rules handling the traffic please

Sorry for the delay in replying.

So what was causing the routing issue with LAN clients was the fact that I did not have "Disable routes" checked in the Local Endpoint configuration. I was migrating my config step-by-step from pfSense, where this isn't an option, hence the omission. With "Disable routes" checked, the WG interface no longer messes up the regular LAN-WAN traffic. My apologies for allowing myself to show my frustration.
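For anyone hitting the same thing, here is a rough Python sketch of why installed tunnel routes can win regardless of gateway priority. It only models longest-prefix matching. The routing tables are assumptions for illustration: the thread does not show the real tables, and the two /1 routes stand in for what a tunnel with catch-all allowed IPs might install when "Disable routes" is left unchecked. The `lookup` helper and the `10.0.0.0/24` LAN prefix are hypothetical as well.

```python
import ipaddress


def lookup(routes, destination):
    """Return the interface of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), iface)
        for prefix, iface in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix wins; gateway priority plays no role at this stage.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Assumed baseline: default route via WAN plus the directly connected LAN.
base = [("0.0.0.0/0", "wan"), ("10.0.0.0/24", "lan")]

# Assumed effect of tunnel routes being installed: two /1 routes that
# together cover the whole IPv4 space and are more specific than /0.
with_tunnel = base + [("0.0.0.0/1", "wg0"), ("128.0.0.0/1", "wg0")]

for table_name, table in (("base", base), ("with_tunnel", with_tunnel)):
    for dest in ("8.8.8.8", "203.0.113.5", "10.0.0.100"):
        print(table_name, dest, "->", lookup(table, dest))
```

In this model every non-LAN destination leaves via `wg0` once the /1 routes exist, even though the default route still points at WAN, which would match the symptoms described earlier in the thread.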

However, having followed https://docs.opnsense.org/manual/how-tos/wireguard-selective-routing.html, I still can't get selective routing to work over WG. I designated one of the LAN hosts to have all its traffic routed via WG, to no avail – the traffic is routed via WAN.

Requested screenshots: https://imgur.com/a/YgzypHL
Note that there aren't any rules in "WG_BULLET" WG interface table nor "WireGuard (Group)"

The setup is:
- WG DNS 10.100.0.1
- WG client address 10.100.1.11/32

From my debugging, it seems that the issue is related to Firewall LAN rules, specifically.

For example, despite a LAN rule being present that should forward traffic from the 10.0.0.100 client to 10.100.0.1 (the WG DNS) via the WG_BULLIT_GW, issuing dig @10.100.0.1 google.com from 10.0.0.100 yields a connection timeout. However, if I add an explicit System Route (System -> Routes -> Configuration) to route 10.100.0.1/32 via the WG_BULLIT_GW, DNS queries work right away, and I can see the relevant traffic in the tcpdump -ni wg1 output:

16:49:28.193324 IP 10.100.1.11.20462 > 10.100.0.1.53: 12284+ [1au] A? google.com. (39)

Firewall stats show that neither of the Firewall rules are being evaluated.


Snap :o I knew it must have been something stupid, spent too much time on this today. Thanks, works as expected now.