Firewall pass-all rules not firing

Started by devl_ish, May 30, 2025, 11:55:09 AM

Hey, I've hit a brick wall and need some places to look for cracks.

Problem: For testing I've put in pass-all rules on all interfaces, but firewall does not match these rules and so default-denies.

Situation (for completeness' sake):

Topsense N100-based device with four 2.5GbE ports, to which I've added an M.2 WiFi card for an access point, plus USB-ethernet and USB WiFi dongles.
The USB devices are just for maintaining GUI access to proxmox and Opnsense as I experiment with networking, and the WiFi card has successfully been passed through to an OpenWRT VM.
I don't expect any of this will be pertinent to the problem.


I've got a fresh OPNSense VM running with 3 interfaces bridged from ProxMox:

1. igc3WAN - home router to 4th port on router to proxmox bridge vmbr3, OPNsense sees it as vtnet0, DHCP client on OPNSense, gets DHCP lease successfully from my home router at 192.168.50.55, can ping gateway and google DNS when firewall brought down with console "pfctl -d"
2. LANvmbr0 -  bridge on proxmox, USB-ethernet device, OPNsense sees it as vtnet2, my home router issues DHCP lease just fine at 192.168.50.39, can ping gateway and google DNS when firewall brought down with console "pfctl -d".
3. VMNetOpenWRT - bridge on proxmox, no physical device, OPNsense sees it as vtnet10, DNSMasq on OPNsense successfully issued leases to an OpenWRT and two Linux Mint VMs for testing, all can ping each other.

Below is the result from using the web interface on LANvmbr0 (192.168.50.39) and the (temporary) ruleset in place that I think should have prevented it from being denied. At the same time, pinging my home router from console times out.

If I then use pfctl -d from console I can both ping and get to the web GUI on this interface, so it's clearly the firewall.







I have:

1. Rebooted the router after changes
2. Rebooted the proxmox host
3. Dumped the state table
4. Restarted all services from console

What would make this allow-all rule set work so I can complete and test the network and then slowly add firewall rules?

Ok, weird. I decided to wait for replies on this thread, and in the meantime disable the packet filter with pfctl -d and work on other bits of the config. I went to gateways and found the IP for the active gateway was no longer showing (this was previously automatically added in and I'm certain it was showing as 192.168.50.1). Added that IP in, rebooted and the pass-all seems to work.

I...still don't know why this would present in logs as it did.

Maybe a little diagram would help to understand your network better. If you want your rules to show up in the logs, you need to activate logging for them. The IPv6 pass-all rule is probably not doing what you expect. The direction of a rule is important. Most of the time the direction should be "in".

My best guess is that you may not be looking at the correct interface. That's why turning the packet filter off helps. If I had to debug this situation, I'd add one pass-all floating rule with logging enabled. This way, one can see in the logs which interface the traffic is coming in on.
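To illustrate why interface and direction matter, here is a minimal Python sketch of pf-style rule matching. It is a toy model, not OPNsense's actual rule engine: the Rule and Packet classes, the evaluate function and the rule labels are all made up for illustration, and the only behaviour modelled is "last matching rule wins unless the rule is quick", with a default deny if nothing matches. The interface names vtnet0 and vtnet2 are the ones from the first post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str               # "pass" or "block"
    interface: Optional[str]  # None = any interface (floating-style rule)
    direction: str            # "in", "out" or "any"
    quick: bool = True        # stop evaluating at the first match
    label: str = ""


@dataclass
class Packet:
    interface: str
    direction: str


def matches(rule: Rule, pkt: Packet) -> bool:
    """A rule only applies to packets on its interface and in its direction."""
    if rule.interface is not None and rule.interface != pkt.interface:
        return False
    return rule.direction in ("any", pkt.direction)


def evaluate(rules, pkt, default="block (default deny)"):
    """Last match wins, unless a matching rule is marked quick."""
    verdict = default
    for rule in rules:
        if matches(rule, pkt):
            verdict = f"{rule.action} ({rule.label})"
            if rule.quick:
                break
    return verdict


# GUI request arriving on vtnet2 (hypothetical packet for illustration)
gui_request = Packet(interface="vtnet2", direction="in")

wrong_interface = [Rule("pass", "vtnet0", "in", label="pass-all on vtnet0")]
wrong_direction = [Rule("pass", "vtnet2", "out", label="pass-all out on vtnet2")]
floating = [Rule("pass", None, "in", label="floating pass-all")]

print(evaluate(wrong_interface, gui_request))  # block (default deny)
print(evaluate(wrong_direction, gui_request))  # block (default deny)
print(evaluate(floating, gui_request))         # pass (floating pass-all)
```

In this model a pass-all rule that sits on a different interface, or that only covers the "out" direction, never matches the inbound packet, so the default deny is what shows up.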

It's not obvious what your IP subnet is, but assuming a /24, the traffic in question should not even hit the FW...

IIRC, you were also playing with bridges on PVE/OPN.
If that traffic is exercising the bridge members, it looks like another case of filtering at the wrong level (members vs bridge).

Quote from: mooh on May 30, 2025, 12:39:06 PM
Maybe a little diagram would help to understand your network better. If you want your rules to show up in the logs, you need to activate logging for them. The IPv6 pass-all rule is probably not doing what you expect. The direction of a rule is important. Most of the time the direction should be "in".

My best guess is that you may not be looking at the correct interface. That's why turning the packet filter off helps. If I had to debug this situation, I'd add one pass-all floating rule with debugging enabled. This way, one can see in the logs which interface the traffic is coming in.

Thanks, I'll keep that in mind when debugging future issues (and already have, with a port forward issue I'm having right now that I hope I'll be able to figure out today rather than posting again).

Quote from: EricPerl on May 30, 2025, 07:45:25 PM
It's not obvious what your IP subnet is, but assuming a /24, the traffic in question should not even hit the FW...

IIRC, you were also playing with bridges on PVE/OPN.
If that traffic is exercising the bridge members, it looks like another case of filtering at the wrong level (members vs bridge).

Thanks - I had the pass-all rules on both the member and bridge and was using /24.

Again, such traffic should go between peers directly.
One possibility is that one of them is misconfigured (e.g. /32 netmask) so that it sends everything to the gateway.

Another possibility is that the 2 peers are connected to different members of the bridge.
A properly configured bridge should not perform any filtering at the member level for this reason.
Connections between peers are regular switching and typically not subject to filtering (which is expected at the interface level).
That's what the tunables in the bridging guide are all about.
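The netmask point can be sketched in a few lines of Python using the standard ipaddress module. This is only an illustration of the on-link decision a host makes: the next_hop function is hypothetical, the peer address 192.168.50.10 and the gateway 192.168.50.1 are assumed for the example, and 192.168.50.39 is the address from the first post.

```python
import ipaddress


def next_hop(local_iface: str, destination: str, gateway: str) -> str:
    """Return where a host sends a packet: directly on-link or via its gateway."""
    iface = ipaddress.ip_interface(local_iface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return f"direct to {dest}"
    return f"via gateway {gateway}"


# Correct /24 netmask: the peer is on-link, traffic goes to it directly
print(next_hop("192.168.50.10/24", "192.168.50.39", "192.168.50.1"))
# -> direct to 192.168.50.39

# Misconfigured /32 netmask: nothing is on-link, everything goes to the gateway
print(next_hop("192.168.50.10/32", "192.168.50.39", "192.168.50.1"))
# -> via gateway 192.168.50.1
```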

Quote from: EricPerl on June 02, 2025, 10:29:32 PM
Again, such traffic should go between peers directly.
One possibility is that one of them is misconfigured (e.g. /32 netmask) so that it sends everything to the gateway.

Another possibility is that the 2 peers are connected to different members of the bridge.
A properly configured bridge should not perform any filtering at the member level for this reason.
Connections between peers are regular switching and typically not subject to filtering (which is expected at the interface level).
That's what the tunables in the bridging guide are all about.

Ah - the 192.168.50.x network is on my home router, and the 192.168.50.10 address is my laptop (which ifconfig shows to be using a /24 netmask). So my laptop at .10 is on the same home router network as OPNsense at .39, which sees that network as an upstream WAN. To me that seems analogous to OPNsense being connected to my ISP's ONT at the WAN port, with my laptop like some other client on the internet trying to talk to it, so I'd expect that traffic to be firewalled. I'm not surprised that it didn't work without the default gateway being set, but I am surprised that it presented as a default block by the firewall and not just a routing error.

I can't believe how badly I misread that original post.

Both the "WAN" and the "LAN" side of OPN get IP via DHCP from your home router???
That's an invalid setup. You can't have overlapping subnets across interfaces.
OPN's LAN should have its distinct range, statically assigned.
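A quick way to check for this kind of overlap is Python's ipaddress module. This is just a sketch: the overlapping helper is hypothetical, the WAN and LAN addresses are the two DHCP leases from the first post (assuming /24), and 192.168.1.1/24 is an arbitrary example of a distinct LAN range, not a recommendation.

```python
import ipaddress
from itertools import combinations


def overlapping(interfaces: dict) -> list:
    """Return pairs of interface names whose subnets overlap."""
    nets = {name: ipaddress.ip_interface(addr).network
            for name, addr in interfaces.items()}
    return [(a, b) for a, b in combinations(nets, 2)
            if nets[a].overlaps(nets[b])]


# Both interfaces leased from the home router: same 192.168.50.0/24 subnet
as_posted = {"WAN": "192.168.50.55/24", "LAN": "192.168.50.39/24"}

# LAN moved to its own statically assigned range (example range only)
separated = {"WAN": "192.168.50.55/24", "LAN": "192.168.1.1/24"}

print(overlapping(as_posted))  # [('WAN', 'LAN')]
print(overlapping(separated))  # []
```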

I won't comment on the OpenWRT VM for now.

Quote from: EricPerl on June 03, 2025, 07:28:08 AM
I can't believe how badly I misread that original post.

Both the "WAN" and the "LAN" side of OPN get IP via DHCP from your home router???
That's an invalid setup. You can't have overlapping subnets across interfaces.
OPN's LAN should have its distinct range, statically assigned.

I won't comment on the OpenWRT VM for now.

My fault entirely, I think I need to start over. I couldn't get port forwarding working so I took a proxmox snapshot and then "factory reset" the OPNSense instance.

Background: My ultimate goal is to have all home devices (including WiFi clients, IP Cameras, a general purpose linux server (self hosted apps behind a reverse proxy), a NAS and a zigbee gateway) on separate VLANs via a managed switch behind this OPNSense appliance plugged directly into my fibre ONT.

So far, separately, I've successfully experimented with Outbound NAT, VLANs shared by proxmox VMs and physical devices on a managed switch, and DHCP. When I understand each bit I'll pull everything down and build it out properly.
 
The parts I have not yet been able to get working are port forwards and PPPoE via the ONT to my ISP (not yet attempted).

After factory reset on the OPNSense instance I brought up only the following:

1. WAN Interface via proxmox bridge to NIC to home router
2. One VLAN on a separate proxmox bridge attached to a separate NIC, dnsmasq serving DHCP.
3. Test VM on the same proxmox bridge with same VLAN tag, successful DHCP lease from OPNSense.



My intention is to be able to simulate SSH'ing from anywhere to the WAN public IP and have it port forward to another machine. I can currently do this on my home router quite easily. However today I got perpetually stuck with the firewall logs showing redirection, but connection timeout from the ssh client.

That's on the backburner for the moment because the behaviour described in the original post in this thread is back on the "fresh" instance. I have the same pass-all rules in place, but without pfctl -d it blocks all traffic on the WAN interface including the gui from my laptop, and this time the gateway address wasn't missing.



I've once again gone over the steps including deleting and reinstating interfaces and rules, clearing the state table and rebooting all levels including the home router, but can't seem to get the OPNSense instance to once again pass all traffic. Is there anything that comes to mind to check?

June 03, 2025, 01:49:38 PM #9 Last Edit: June 03, 2025, 02:05:10 PM by devl_ish
Ok, spun up another OPNSense VM to do a cold install, disabled the first one.

After install:

1. Set WAN to DHCP -> OK, new IP 192.168.50.67
2. Added vlan50 -> OK
3. Enabled vlan50 interface with static ip 10.2.1.1/24 -> GUI connection drops.
4. Console pfctl -d. Check firewall logs: they show default deny on WAN IP, port 443, from laptop IP.
5. Delete vlan50 interface via GUI. pfctl -e. No access to GUI.
6. Console pfctl -d. Delete vlan50 device. pfctl -e. No access to GUI.

Might this be a bug? I should be able to reverse changes and have results reverse too, unless creating VLAN devices and assignments makes changes not undone in deletion?

Edit: Tried same process with proxmox bridge with no physical member. Other VM members on this bridge can still talk to each other just fine with static IPs so everything seems working as it should with the bridge itself. Still knocks out the firewall rules.

The diagram is reasonable. You don't need to assign a member NIC to the "LAN" bridge if you only plan to connect VMs.
meyergru has a Proxmox + OPN post in the tutorials. The 2nd post is about a data center use case (1 NIC).
Feel free to continue to use the USB NIC but you don't have to. You can also adapt as appropriate.

Make that Proxmox bridge VLAN aware. All you have to do for the Linux Mint VM is to specify a VLAN tag = 50 in the network device passed to the VM.

Wrt to your issue:
By default, WAN does not allow access to the GUI (no FW rules).
But if it's the only interface, anti-lockout applies to it.
When you're adding another interface, anti-lockout "moves" to that...

Add a FW rule to WAN to allow GUI access.

Quote from: EricPerl on June 03, 2025, 07:17:17 PM
Add a FW rule to WAN to allow GUI access.

And disable reply-to if the management PC is connected to the same network as WAN and the default gateway.

Quote from: EricPerl on June 03, 2025, 07:17:17 PM
The diagram is reasonable. You don't need to assign a member NIC to the "LAN" bridge if you only plan to connect VMs.
meyergru has a Proxmox + OPN post in the tutorials. The 2nd post is about a data center use case (1 NIC).
Feel free to continue to use the USB NIC but you don't have to. You can also adapt as appropriate.

Thanks, it took a while to wrap my head around that from tutorials but it's been consistently working :-)

Quote from: EricPerl on June 03, 2025, 07:17:17 PM
Make that Proxmox bridge VLAN aware. All you have to do for the Linux Mint VM is to specify a VLAN tag = 50 in the network device passed to the VM.


They are, and VLANs are working fine otherwise - on the last VM I also successfully brought up an outbound NAT for internet access, and the Mint VM with a VLAN tag could talk easily with a debian SBC I had plugged in to a managed switch on a tagged port.
Interestingly enough an issue I had two days ago was that I accidentally enabled VLAN awareness in proxmox on the WAN bridge and that took down all OPNSense networking. I reversed that change but nothing I did could get it back. I ended up restoring from snapshot which to my mind shouldn't have worked but it did. I don't have enough records on what I did to troubleshoot that again though...


Quote from: EricPerl on June 03, 2025, 07:17:17 PM
Wrt to your issue:
By default, WAN does not allow access to the GUI (no FW rules).
But if it's the only interface, anti-lockout applies to it.
When you're adding another interface, anti-lockout "moves" to that...

Add a FW rule to WAN to allow GUI access.

That's the part I'm having difficulty understanding - the pass-all should cover that (and it did on the last VM) but it doesn't fire, it still reaches default-deny. The rule moving to the new interface (and not moving back on delete) explains why the anti-lockout doesn't do its thing anymore, but I can't figure out why the pass-all doesn't trigger.

The new auto rules are showing here, so I tried to specifically replicate the last 3 back on the WAN:


I've substituted (self) for "This Firewall" but can't find a way to replicate "(vtnet0)", the port assigned as [WAN].
No change in behaviour.

Quote from: Patrick M. Hausen on June 03, 2025, 08:17:34 PM
Quote from: EricPerl on June 03, 2025, 07:17:17 PM
Add a FW rule to WAN to allow GUI access.

And disable reply-to if the management PC is connected to the same network as WAN and the default gateway.

Thanks - tried that, unfortunately no change.

Ok, what the actual hell.

I deleted the VLAN and brought up a bridge to a physical port for testing - no change in behaviour.
Then I added the VM back in, reasoning that I could just run the management interface through a VM in the interim.

Didn't need pfctl -d. Yes, for some reason it now works as soon as the next interface was added. I have no explanation for this. pfctl -e tells me the firewall is definitely already back up.

I'll need to understand eventually but for now I'm going to work on that port forward test.


Good call from Patrick. I was time pressed and forgot to mention that one.
It's always sane to do this on an internal OPN. Without it, traffic is bouncing off of the edge router LAN GW...
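As a rough picture of what reply-to changes, here is a small Python sketch. It is a simplification under stated assumptions: reply_next_hop is a hypothetical function, and the only behaviour modelled is that with reply-to the reply is forced to the interface's gateway, while without it the normal on-link decision applies. The addresses are the ones from this thread (WAN lease .67, laptop .10, home router .1), assuming /24.

```python
import ipaddress


def reply_next_hop(wan_iface: str, client: str, gateway: str,
                   reply_to_enabled: bool) -> str:
    """Return the address the reply packet is handed to first."""
    if reply_to_enabled:
        # reply-to pins replies to the gateway, bypassing the routing table
        return gateway
    network = ipaddress.ip_interface(wan_iface).network
    if ipaddress.ip_address(client) in network:
        return client  # on-link: reply goes straight back to the client
    return gateway


wan = "192.168.50.67/24"
laptop = "192.168.50.10"
home_router = "192.168.50.1"

print(reply_next_hop(wan, laptop, home_router, reply_to_enabled=True))
# -> 192.168.50.1 (reply bounces off the home router)
print(reply_next_hop(wan, laptop, home_router, reply_to_enabled=False))
# -> 192.168.50.10 (reply goes directly to the laptop)
```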

FWIW, the anti-lockout rule is primarily a port forward rule (with associated FW rule).
What you see on that screen is the associated rule.
On WAN, all you need is one rule to allow HTTPS (you don't really need the HTTP one if you disable the HTTP->HTTPS redirect or type the full https:// when you access the GUI).
Note that the rule won't fire if the anti-lockout rules are currently on that interface because PF rules take precedence over FW rules.

How about showing the WAN rule you added?

Going back to the vmbr2 bridge (LAN side of OPN).
A bridge is a switch.
If the member is a physical NIC, it's like a physical port on the switch.
When you assign the bridge to a VM, you get a virtual port on the switch.

The traffic on the bridge is going to be dependent on the interface(s) you assign on the OPN side:
* Assign an interface to vtnet5 directly and traffic is not tagged
* Assign an interface to a vlan device parented to vtnet5 and traffic is tagged

When you have vmbr2 passed to the VM with a VLAN tag, it will only be exposed to traffic from the VLAN (BUT the traffic received by the VM is untagged, as with a regular access port on a switch). You don't have to do any VLAN config within the VM.

If you connected a machine to NIC3:
When the OPN LAN interface is assigned to a vlan device, traffic is tagged (and ignored by that machine unless its NIC is configured appropriately).
When the OPN LAN interface is assigned to vtnet5 directly (no VLAN), traffic is not tagged and should flow freely.
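The tagged/untagged cases above can be written down as a toy Python model. This is only a sketch of the behaviour described in this post, not of how Proxmox or OPNsense implement it: both functions are hypothetical, a frame is represented just by its VLAN ID (None meaning untagged), and VLAN 50 is the tag used earlier in the thread.

```python
from typing import Optional


def vm_access_port(frame_vlan: Optional[int], port_tag: int) -> Optional[str]:
    """Bridge port passed to a VM with a VLAN tag: acts like an access port."""
    if frame_vlan == port_tag:
        return "delivered untagged"  # tag is stripped before the VM sees it
    return None                      # other VLANs and untagged frames are not seen


def plain_nic(frame_vlan: Optional[int],
              nic_vlan: Optional[int] = None) -> Optional[str]:
    """Machine on the physical member port: the frame arrives with its tag intact."""
    if frame_vlan == nic_vlan:
        return "accepted"
    return None                      # tagged frame ignored by an unconfigured NIC


# OPN LAN assigned to a vlan device parented to vtnet5: frames carry tag 50
print(vm_access_port(50, port_tag=50))  # delivered untagged
print(plain_nic(50))                    # None (ignored unless the NIC is set up for VLAN 50)
print(plain_nic(50, nic_vlan=50))       # accepted

# OPN LAN assigned to vtnet5 directly: frames are untagged
print(plain_nic(None))                  # accepted
print(vm_access_port(None, port_tag=50))  # None
```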