Messages - zuppaduppa

#1
Hi everyone,

I'm hitting a wall with a VLAN issue where tagged traffic seems to be processed incorrectly by my OPNsense VM, despite extensive diagnostics showing the Proxmox bridge and physical network are working correctly.

My Setup:

    - Host: Proxmox VE 8.4.14 (Kernel 6.8.12-16-pve).

    - Hardware: CWWK Mini PC (N100/N150 model) with 4x Intel i226-V 2.5GbE NICs.

    - VM: OPNsense 25.7 (VM 100).

    - Network: UniFi Switch (USW Flex) & AP (U6 IW).

    - VLANs: LAN (untagged, Native VLAN 1), IOT (VLAN 100), GUEST (VLAN 200).

Problem: Traffic from my IOT VLAN (e.g., Chromecast, 192.168.100.100) destined for a server on the LAN (192.168.10.5:443) is being processed on my LAN interface instead of my LAN_IOT interface. The OPNsense firewall log shows this traffic being passed by the LAN interface rules, completely bypassing my LAN_IOT rules. This happens with both OPNsense network configurations I've tried (see below).

Troubleshooting & Evidence:

    - Proxmox Bridge (vmbr1): Marked "VLAN aware" (bridge-vlan-aware yes), and the config file confirms bridge-vids 2-4094.

    - tcpdump Tests:

        - tcpdump on the physical NIC (enp2s0) shows VLAN 100 tags arriving from the UniFi switch.

        - tcpdump on the bridge (vmbr1) also shows the VLAN 100 tags are present and being passed to the VM.

    - "Smoking Gun" LXC Test:

        - I created a new Alpine LXC (CT 102) on the same host.

        - I gave it two vNICs on vmbr1: net0 (untagged, 192.168.10.250) and net1 (VLAN Tag 100, 192.168.100.250).

        - I successfully pinged both interfaces from my laptop (pinging .100.250 while on the IOT VLAN, pinging .10.250 while on the LAN).
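
For anyone wanting to reproduce the tcpdump checks above, here is a minimal sketch of the kind of invocation involved. The helper names (vlan_filter, capture_vlan) and the 20-frame limit are illustrative assumptions, not necessarily the exact commands used, and it needs root on the Proxmox host.

```shell
# Build a pcap filter that matches frames carrying a given VLAN ID and,
# inside that VLAN, traffic to or from one host.
vlan_filter() {
    printf 'vlan %s and host %s\n' "$1" "$2"
}

# Capture up to 20 matching frames on an interface.
# -e prints the link-level header (where the 802.1Q tag is shown),
# -n disables name resolution.
capture_vlan() {
    tcpdump -e -n -c 20 -i "$1" "$(vlan_filter "$2" "$3")"
}
```

Usage would be along the lines of capture_vlan enp2s0 100 192.168.100.100 for the physical NIC, and the same with vmbr1 for the bridge.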

Conclusion: This shows that my Proxmox bridge (vmbr1) handles both tagged and untagged traffic correctly for an LXC. The problem therefore appears to be isolated to the OPNsense VM (KVM/QEMU) or its interaction with the bridge.
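
One thing the LXC test does not cover is the VLAN membership of the VM's own tap port on vmbr1, since containers attach through veth devices rather than tap devices. A minimal sketch of how that could be inspected on the host, assuming the usual Proxmox naming of tap<VMID>i<N> for a VM's netN; the helper names are hypothetical:

```shell
# Proxmox names the host-side tap device of a VM's netN interface tap<VMID>i<N>.
tap_name() {
    printf 'tap%si%s\n' "$1" "$2"
}

# List the VLAN IDs the bridge allows on that tap port (iproute2 "bridge" tool).
# Run as root on the Proxmox host while the VM is running.
show_tap_vlans() {
    bridge vlan show dev "$(tap_name "$1" "$2")"
}
```

For example, show_tap_vlans 100 1 would list the VLANs configured on the port behind net1 of VM 100.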

Failed Fixes (The problem persists after all these steps):

    - Architecture 1 (Router-on-a-Stick): OPNsense VM with one VirtIO vNIC (vtnet1 on vmbr1, no tag), OPNsense handles VLANs internally (vlan01, vlan02 parented to vtnet1). -> Result: Leak.

    - Architecture 2 (PVE-handled VLANs): OPNsense VM with separate VirtIO vNICs on vmbr1 (net1 untagged, net4 with tag=100, net5 with tag=200). OPNsense interfaces assigned directly to vtnet1, vtnet4, vtnet5. -> Result: Same leak.

    - Alternative vNIC Drivers: Changing all OPNsense vNICs to E1000 or vmxnet3 causes the OPNsense VM to kernel panic and fail to boot. Only VirtIO boots, but it has this leak.

    - Host/Driver Fixes:

        - Rebooted Proxmox host multiple times.

        - Reset OPNsense state table.

        - Added bridge-mcsnoop 0 to the bridge config.

        - Disabled the Proxmox firewall on all OPNsense vNICs.

        - Disabled EEE (EEE status: disabled) and GRO (ethtool -K enp2s0 gro off) on the host's physical NIC.

    - IPv6: Allow IPv6 is disabled in OPNsense settings, so this is not an IPv6 leak.

I am completely out of ideas. Only the VirtIO vNIC boots, but OPNsense does not seem to handle tagged traffic correctly on it, even though the bridge itself tests out fine. What else could cause this?

Thanks for any help!