Re-application of firewall rules needed to allow Wireguard routing after reboot

Started by Westie, November 28, 2023, 12:38:29 PM

Dear lurkers: I have a workaround here.

Hi all

Unfortunately I haven't been able to find any logs that help with this issue, so I'll describe it in full in the hope that someone can suggest how to diagnose it.

This came to light when upgrading from 23.1 to the latest community version, which at the time of this post is 23.7.9.

As far as I can tell, everything appeared to be working when I had performed checks post-install of 23.7, however I wouldn't trust that.

So, issue:

  • Firewall VM is booted
  • Wireguard is booted
  • WG user can connect to WG server, and can access OPNsense admin UI
  • WG user can access any IPs that are declared/routed through Wireguard
  • WG user cannot access any IPs that are routed to the firewall itself, other than the port of admin UI/admin SSH
  • Any attempt by WG user to access IPs routed through the firewall is met with a connection timeout
  • Attempts by WG user to access those IPs do appear in the logs (filter=dst_ip, filter=dst_port), and they are not blocked.
  • NOTE: I didn't check whether any responses to the originating IP address were logged or blocked.
  • WG user proceeds to "Firewall" > "Aliases" page within admin UI
  • WG user proceeds to click "Apply" at the bottom of the page
  • WG user is now able to access IPs local to the Firewall

I'm confused. How should I go about this, other than adding a script to perhaps re-apply firewall rules soon after boot?

Note: all other routing works as intended; I just appear to have problems being routed through the firewall between the firewall booting and me manually re-applying the rules.

I have this or a similar problem too. My post is here: https://forum.opnsense.org/index.php?topic=36942.msg180942

There is a suggestion to diagnose there, but I have not had time to do that yet.

I'm currently testing code that should prevent relevant content from being omitted from the /tmp/rules.debug file. That file carries content such as NAT rules, which people have reported to be missing until a local firewall/filter restart renders them correctly.

Much of this has to do with overzealous scrubbing of network device configuration in response to dynamic events and network hiccups encountered out in the wild (line going down and up for a brief moment).

There is a test patch here that may improve the situation:

https://github.com/opnsense/core/commit/64e0867a4

# opnsense-patch 64e0867a4

If that doesn't work at first glance, please diff the /tmp/rules.debug file before and after applying the manual filter fix to see what we are actually missing.


Thanks,
Franco
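
For anyone repeating this comparison, here is a minimal sketch of the before/after diff as a shell helper. The function name missing_rules is made up for illustration, and the default paths are simply the file names used in this thread.

```sh
#!/bin/sh
# Sketch: list the rules that only exist after the manual filter re-apply.
# missing_rules is a hypothetical helper, not part of OPNsense.
missing_rules() {
    before="${1:-/tmp/rules.debug.1}"   # snapshot taken right after boot
    after="${2:-/tmp/rules.debug}"      # ruleset after the manual re-apply
    # diff prefixes lines found only in the second file with "> ";
    # print those lines without the marker.
    diff "$before" "$after" | sed -n 's/^> //p'
}
```

A possible workflow: after a reboot, copy /tmp/rules.debug to /tmp/rules.debug.1, re-apply the rules manually, then call missing_rules without arguments.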

Patch appears to not help.

Proof of patch being applied:
root@meerkat-firewall:~ # cat /usr/local/www/interfaces_assign.php | grep interface_bring_down | wc -l
       0


Diff of debug file after manual reapply
root@meerkat-firewall:~ # diff /tmp/rules.debug.1 /tmp/rules.debug
85a86,87
> nat on vtnet0 inet from (wg0:network) to any port 500 -> (vtnet0:0) static-port # Automatic outbound rule
> nat on vtnet0 inet from (wg1:network) to any port 500 -> (vtnet0:0) static-port # Automatic outbound rule
89a92,93
> nat on vtnet0 inet from (wg0:network) to any -> (vtnet0:0) port 1024:65535 # Automatic outbound rule
> nat on vtnet0 inet from (wg1:network) to any -> (vtnet0:0) port 1024:65535 # Automatic outbound rule
102a107,110
> rdr on wg0 inet proto {tcp udp} from {any} to {(self)} port {80} -> $traefik port 80
> nat on wg0 inet proto {tcp udp} from (wg0:network) to $traefik port {80} -> (wg0) port 1024:65535
> rdr on wg1 inet proto {tcp udp} from {any} to {(self)} port {80} -> $traefik port 80
> nat on wg1 inet proto {tcp udp} from (wg1:network) to $traefik port {80} -> (wg1) port 1024:65535


There are numerous additional instances of what's been flagged up in 102a107,110; however, I've removed them for clarity.
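
To see at a glance that every missing rule involves a WireGuard interface, the diff output can be tallied per interface. This is only a sketch: wg_rule_counts is a hypothetical helper, and it assumes the interfaces are named wg0, wg1 and so on, as in the diff above.

```sh
#!/bin/sh
# Sketch: count added rules per WireGuard interface in diff output on stdin.
# wg_rule_counts is a hypothetical helper; it assumes wgN interface names.
wg_rule_counts() {
    # Only look at lines that diff marks as added ("> ..."), take the
    # first wgN token on each of them, and count per interface.
    awk '/^> / && match($0, /wg[0-9]+/) { n[substr($0, RSTART, RLENGTH)]++ }
         END { for (i in n) print i, n[i] }' | sort
}
```

Usage would be `diff /tmp/rules.debug.1 /tmp/rules.debug | wg_rule_counts`. Added lines that mention no wgN interface are not counted, so a total lower than the number of added lines would point at other missing rules.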

I've just reverted the patch and can confirm that the same diff is obtained.

Thank you for responding quickly by the way @franco, I appreciate the help

Another post!

A friend and I independently, and at almost exactly the same time, came up with the same suggestion: use syshooks to work around the issue.

Contents of /usr/local/etc/rc.syshook.d/start/99-reload-rules:

#!/bin/sh

# Workaround: re-run the filter configuration in the background, 60 seconds after boot
sh -c "(sleep 60 && /usr/local/etc/rc.filter_configure) > /dev/null" &


Whilst I could probably get away with reducing the timer down to 30 seconds, I kept it at a minute because honestly... if things were to go wrong, 30 seconds would be the least of my issues.
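
A possible variation on the hook above is to poll for the interface instead of sleeping for a fixed time. This is an untested sketch under assumptions: wait_for and wg_ready are hypothetical helpers, and it assumes that wg0 is "ready" once ifconfig reports an IPv4 address on it, which may not be the condition the filter generation actually depends on.

```sh
#!/bin/sh
# Sketch of a polling variant of the syshook above (helpers are hypothetical).

# Retry a command once per second until it succeeds or the given number
# of attempts is used up. Returns 0 on success, 1 on timeout.
wait_for() {
    remaining="$1"; shift
    while ! "$@"; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Assumption: the interface is ready once it carries an IPv4 address.
wg_ready() {
    ifconfig "${1:-wg0}" 2>/dev/null | grep -q 'inet '
}

# In the hook itself one might then run, in the background:
#   (wait_for 60 wg_ready wg0 && /usr/local/etc/rc.filter_configure) > /dev/null 2>&1 &
```

Like the fixed sleep, this only papers over the timing problem; it does not make the filter pick up the right data by itself.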

Thanks for taking the time to look into this. Can you briefly tell more about your setup? You probably have your WireGuard instance assigned as an interface. In the interface config are IPv4 and IPv6 mode set to "none"? Do you have a "Tunnel Address" configured in the WireGuard instance? And what's your main WAN connectivity?

Rerunning the filter fixes it, but ideally the filter should pick up the right data at the right point in time, not at some loosely related point afterwards. I'll be looking into this with my colleague today.


Cheers,
Franco

Looks like this was it. WireGuard was missing the same functionality that OpenVPN has:

https://github.com/opnsense/plugins/commit/7b94f91a5f

# opnsense-patch -c plugins 7b94f91a5f


Cheers,
Franco

Oh boy thank you Franco,

I will test the patch as I was affected as well.

WG was configured:
- WG interface created to bind to the WG instance, no IP or additional configuration on the assigned interface
- IP was set on the WG instance
- WAN is based on IP Ethernet, connectivity delivered from Telco via a COAX to a Modem
Telco GW - coax - coax modem - Ethernet - OPNsense


Regards,
S.
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Tried to apply the patch, but the fetch currently fails:

fetch: https://github.com/opnsense/core/commit/7b94f91a5f.patch: Not Found

Regards,
S.

Oops, I keep making that mistake with WireGuard almost being in core...

# opnsense-patch -c plugins 7b94f91a5f
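
The reason the first attempt failed can be illustrated with a small sketch of how the patch URL differs per repository. patch_url is a hypothetical helper, and the URL layout is inferred from the fetch error quoted above, not taken from the opnsense-patch source.

```sh
#!/bin/sh
# Sketch: build the URL a commit patch would be fetched from.
# patch_url is a hypothetical helper; the layout is an assumption based
# on the "Not Found" URL in the fetch error above.
patch_url() {
    repo="$1"     # "core" or "plugins"
    commit="$2"   # abbreviated commit hash
    printf 'https://github.com/opnsense/%s/commit/%s.patch\n' "$repo" "$commit"
}
```

With `patch_url core 7b94f91a5f` this yields the URL that returned "Not Found", because the commit lives in the plugins repository; `patch_url plugins 7b94f91a5f` points at the right one, which is what the `-c plugins` option selects.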

Alright, I tried it, but sadly I am still seeing at least the issue with NAT not being applied for WG.

OPNsense version:  23.7.9-amd64

Fetched 7b94f91a5f via https://github.com/opnsense/plugins
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 7b94f91a5f3c99b907db8cad38e99141ea9f8f3a Mon Sep 17 00:00:00 2001
|From: Franco Fichtner <franco@opnsense.org>
|Date: Wed, 29 Nov 2023 10:30:50 +0100
|Subject: [PATCH] net/wireguard: add a filter reload if something was
| reconfigured
|
|PR: https://forum.opnsense.org/index.php?topic=37248.0
|---
| net/wireguard/Makefile                                        | 2 +-
| net/wireguard/pkg-descr                                       | 2 ++
| .../src/opnsense/scripts/Wireguard/wg-service-control.php     | 4 ++++
| 3 files changed, 7 insertions(+), 1 deletion(-)
|
|diff --git a/net/wireguard/src/opnsense/scripts/Wireguard/wg-service-control.php b/net/wireguard/src/opnsense/scripts/Wireguard/wg-service-control.php
|index 249e6f606a..0e09a98a60 100755
|--- a/net/wireguard/src/opnsense/scripts/Wireguard/wg-service-control.php
|+++ b/net/wireguard/src/opnsense/scripts/Wireguard/wg-service-control.php
--------------------------
Patching file opnsense/scripts/Wireguard/wg-service-control.php using Plan A...
Hunk #1 succeeded at 294.
done
All patches have been applied successfully.  Have a nice day.


Performed a reboot

1:03PM  up 1 min, 1 user, load averages: 0.24, 0.10, 0.04

I also did the comparison; after re-applying the rules, NAT traffic from WG started to reach the Internet.

diff /tmp/rules.debug.1 /tmp/rules.debug
120a121
> nat on igc0 inet from (wg1:network) to any port 500 -> (igc0:0) static-port # Automatic outbound rule
128a130
> nat on igc0 inet from (wg1:network) to any -> (igc0:0) port 1024:65535 # Automatic outbound rule


Regards,
S.


Yes I do, aka WAN Interface, tracking the 1st HOP aka Telco GW.

Regards,
S.