Untagged Traffic ceased on LAN with 24.7.10_2

Started by tops4u, December 09, 2024, 08:04:59 AM

I upgraded from 24.7.7 to 24.7.10_2 yesterday evening and let it run, since our Internet connection was a bit slow at the time.

This morning my family complained about broken connectivity. I tried to log in to my OPNsense installation, but failed. Finally I could log in over VPN / mobile.

I could not figure out the source of the problem. Symptoms: no untagged traffic over the LAN interface (igc0), while all tagged traffic worked like a charm over the same interface. No config change was made by the update or on the switch on the other side. It is definitely not a firewall issue, since no traffic was arriving on the LAN interface.

Since I did not know how to fix it, I reverted the installation to 24.7.9 and the kernel to 24.7.8. Everything works as expected again.

Interfaces that do not carry mixed traffic (tagged and untagged) worked flawlessly.

I have some more info on this.

Tagged traffic on LAN interface -> WAN worked (e.g. my solar converters on a VLAN reached the Internet)
Untagged traffic on LAN interface -> WAN did not work (e.g. browser on LAN did not reach the Internet)
Untagged traffic on LAN interface -> tagged LAN worked (e.g. browser reached Shelly on the restricted net)
Untagged traffic on LAN interface -> untagged LAN did not work (e.g. browser to NAS did not work)
Tagged traffic on LAN interface -> untagged LAN did not work (e.g. Shelly actions to NAS Docker image)

The physical link was up the whole time. DHCP sometimes worked and sometimes did not (probably the ACK was not received), according to the managed switch that is directly connected to OPNsense on the LAN interface. NTP could not be synced.

December 09, 2024, 01:34:10 PM #2 Last Edit: December 09, 2024, 01:40:04 PM by vigeland
I have the same problem.

Can you ping the router? I can't, except briefly during the boot phase. Do you use WireGuard?

Pings were lost when one of these log lines was displayed:
>>> Invoking start script 'freebsd'
setup igb0_vlan5
setup igb1 [egress only]
Starting flowd_aggregate.

There is a "setup igb0_vlan5" line for only one of the six VLANs. There is no "setup igb0_vlan4" line while booting.
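
A minimal sketch for checking this kind of boot output against the VLAN devices you expect, assuming the "setup <device>" line format shown in the log above. The function name and the expected device list are hypothetical, and a missing "setup" line is only a hint, not proof that the device was left unconfigured.

```python
import re


def vlan_devices_set_up(boot_log):
    """Collect device names from 'setup <parent>_vlan<N>' lines in boot output."""
    return set(re.findall(r"^setup (\S+_vlan\d+)\b", boot_log, flags=re.MULTILINE))


# Log excerpt as posted above.
boot_log = """>>> Invoking start script 'freebsd'
setup igb0_vlan5
setup igb1 [egress only]
Starting flowd_aggregate.
"""

# Hypothetical list of VLAN devices that should appear; adjust to your setup.
expected = {"igb0_vlan4", "igb0_vlan5"}

missing = expected - vlan_devices_set_up(boot_log)
print(sorted(missing))  # ['igb0_vlan4']
```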

No, I could not ping anything at all. Luckily I had a VPN that I could use to reach the GUI.

As of now, at least 3 people have reported this problem when using 24.7.10 with an interface that carries both tagged and untagged traffic. Reverting to the 24.7.8 kernel with the 24.7.9 base did solve the problem.

I'm no kernel expert, but maybe the latest bugfix introduced this problem?


I have a problem getting access to my UniFi ports, and I really need to lock my PlayStation 5 to one of them for the speed. Now it jumps between the ports, and when the PlayStation gets connected to the port one floor up, the internet gets very slow. Can somebody help with this? This problem started after the new system update.

If this was all working before then there's very likely a new issue here, but just in case-

I have seen multiple posts from senior members warning against mixing tagged and untagged traffic in FreeBSD.  Is this one of the symptoms of that?

This is one of the very first threads I came across when first setting up my home network: https://forum.opnsense.org/index.php?topic=36530.0

There's a warning in the OPNsense docs against mixing: https://docs.opnsense.org/manual/how-tos/vlan_and_lagg.html

To better understand the reasons and potential issues, I asked ChatGPT and am providing the conversation below for reference.

If this is indeed a common issue then I'm surprised OPNsense doesn't have a built-in warning in the application, as most people probably only go to the docs when they get stuck and are likely to miss it.  There's nothing that prevents people from setting things up like this and not realizing that there could be routing problems.

------

Prompt:
Quote
We are discussing networking in FreeBSD, particularly firewall systems based on pf such as OPNsense and pfSense.

There are some opinions that tagged and untagged traffic should not be mixed on the same interface.  Why?  What is the technical explanation and what problems does it cause?

Answer by ChatGPT 4o mini, 17 December 2024:
Quote
In FreeBSD, particularly with firewall systems like pfSense and OPNsense, the issue of mixing tagged and untagged traffic on the same interface arises when dealing with VLANs (Virtual Local Area Networks) and standard Ethernet traffic.

Explanation of Tagged and Untagged Traffic

- Tagged traffic refers to Ethernet frames that include a VLAN tag (also known as a 802.1Q tag). This tag carries information about which VLAN the frame belongs to, allowing a network switch or other network devices to properly segregate traffic into different logical networks (VLANs) over the same physical interface.
- Untagged traffic, on the other hand, does not include any VLAN tag. It is treated as being part of the default VLAN (usually VLAN 1 on most switches), which is typically used for normal, unsegmented network traffic.

Why Should Tagged and Untagged Traffic Not Be Mixed?

The technical issue with mixing tagged and untagged traffic on the same interface involves how network devices process Ethernet frames. Here are some of the key reasons:

1. Ambiguity in Frame Processing
   - If both tagged and untagged traffic is sent through the same interface, it can cause ambiguity in how frames are handled. When an Ethernet frame arrives at the interface, the device needs to know if the frame should be processed as part of a specific VLAN (tagged) or as part of the default network (untagged). This confusion can lead to misrouting or mishandling of traffic.
   - For example, if the interface receives a frame without a VLAN tag (untagged) but is also configured to handle tagged traffic for several VLANs, it might not be clear which VLAN the untagged frame should belong to, potentially causing traffic to be misdirected.

2. Security Risks and Traffic Leakage
   - Mixing tagged and untagged traffic on the same interface can lead to security vulnerabilities. If traffic that is not meant to be part of a VLAN is accidentally tagged or handled as part of another VLAN, sensitive traffic could be exposed to devices that should not have access to it.
   - Additionally, malicious users or misconfigured devices might be able to inject tagged traffic into an untagged VLAN, leading to VLAN hopping, where traffic is inadvertently allowed to access networks it shouldn't.

3. Operational Complexity
   - Having both tagged and untagged traffic on the same interface complicates network administration. It can be more difficult to troubleshoot network issues because traffic might behave unpredictably. Misconfigurations are harder to detect when the network doesn't clearly separate VLAN-tagged and untagged traffic.
   - Network configurations such as DHCP (Dynamic Host Configuration Protocol) can also become more complicated. If an interface serves both tagged and untagged traffic, it may be difficult for DHCP servers to assign IP addresses properly to hosts in different VLANs, leading to IP address conflicts or other networking issues.

4. Performance Considerations
   - Mixing tagged and untagged traffic can also lead to inefficiencies in processing, especially in larger networks. The network stack must continuously check if traffic is tagged, and this overhead can become problematic, particularly in high-traffic environments. Furthermore, switches and routers may handle this situation less efficiently when performing VLAN tagging or untagging operations.

Specific Problems that Can Occur
   - Misrouting of Traffic: A frame may be incorrectly assigned to the wrong VLAN, leading to communication issues and potentially leaking traffic to unintended segments of the network.
   - Broadcast Traffic: Broadcast traffic, typically sent untagged, could be inadvertently forwarded to the wrong VLAN if the interface is improperly configured.
   - DHCP Problems: If a device requests an IP address on an untagged VLAN but the interface is processing multiple VLANs, the DHCP process may fail to assign the correct IP, as the VLAN tag information may not be properly applied or recognized.
   - VLAN Hopping: In some cases, a misconfigured interface could allow tagged traffic from one VLAN to enter another VLAN, bypassing segmentation and causing unauthorized access.

Best Practices
To avoid these issues, it is generally recommended to separate tagged and untagged traffic on different interfaces. This can be done by:
   - Using different physical interfaces for tagged and untagged traffic.
   - Using VLAN-aware interfaces where tagged traffic is explicitly separated from untagged traffic through proper VLAN configuration.
   - Ensuring that the network device's configuration clearly distinguishes between traffic that should be tagged and that which should remain untagged.

In summary, mixing tagged and untagged traffic on the same interface can lead to ambiguity, security risks, and operational complications, making it a practice that should generally be avoided in network design. Separating tagged and untagged traffic allows for clearer traffic handling, better security, and simpler network management.
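
For reference, the tagged/untagged distinction described in the quoted answer comes down to two bytes in the Ethernet header. Below is a minimal Python sketch, not how FreeBSD or OPNsense implement it, that classifies a raw frame by checking for the 802.1Q TPID (0x8100) and reading the 12-bit VLAN ID. The function name, MAC addresses and payload are made up for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def classify_frame(frame: bytes):
    """Return ("tagged", vlan_id) or ("untagged", None) for a raw Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # Bytes 12-13 hold the EtherType, or the TPID if the frame is tagged.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return ("untagged", None)
    if len(frame) < 18:
        raise ValueError("frame too short for an 802.1Q tag")
    # The next 16 bits are the TCI: 3 bits priority, 1 bit DEI, 12 bits VLAN ID.
    (tci,) = struct.unpack("!H", frame[14:16])
    return ("tagged", tci & 0x0FFF)


# Hypothetical example frames (addresses and payload are made up).
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("020000000001")
untagged = dst + src + struct.pack("!H", 0x0800) + b"payload"
tagged = dst + src + struct.pack("!HH", TPID_8021Q, 5) + struct.pack("!H", 0x0800) + b"payload"

print(classify_frame(untagged))  # ('untagged', None)
print(classify_frame(tagged))    # ('tagged', 5)
```

On a parent interface that carries both kinds of traffic, frames of the first kind belong to the parent interface itself (igc0 in the original report), while frames of the second kind belong to the VLAN child device for that ID.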


Quote from: tops4u on December 09, 2024, 08:04:59 AM
I upgraded from 24.7.7 to 24.7.10_2 yesterday evening and let it run, since our Internet connection was a bit slow at the time.

This morning my family complained about broken connectivity. I tried to log in to my OPNsense installation, but failed. Finally I could log in over VPN / mobile.

I could not figure out the source of the problem. Symptoms: no untagged traffic over the LAN interface (igc0), while all tagged traffic worked like a charm over the same interface. No config change was made by the update or on the switch on the other side. It is definitely not a firewall issue, since no traffic was arriving on the LAN interface.

Since I did not know how to fix it, I reverted the installation to 24.7.9 and the kernel to 24.7.8. Everything works as expected again.

Interfaces that do not carry mixed traffic (tagged and untagged) worked flawlessly.

@tops4u have you learned anything more about this condition?  I have a similar problem, and also fixed with the same reversion you used.  But I'd like to know when it's safe to attempt update again.

January 04, 2025, 09:28:21 AM #8 Last Edit: January 04, 2025, 09:30:36 AM by vigeland
I now think that it's not only a tagged/untagged problem. My first attempt to update from 24.7.9 to 24.7.10_2 ended with the LAN interface no longer being accessible. At that point I had one untagged and 4 tagged interfaces on the same hardware interface. I have since reworked this so that there is no more untagged traffic: I now have VLANs 2, 3, 5, 6 and 10 on my LAN interface. Everything runs with 24.7.9_1-amd64. I have now carried out the update again. After the restart, exactly the same thing happens: during the boot phase you can briefly ping the LAN (VLAN 10). After approx. 10-15 pings the connection is lost and the router can no longer be reached via the LAN interface IP (previously untagged, now VLAN 10).

I use WireGuard and IPsec?!

If you can take the downtime, I'd try to characterize the firewall configuration (make sure it boots into the expected configuration), traffic flow (particularly actual entrance interfaces), and ARP table - those were/are my problems. I filter pretty explicitly and have my own default deny rules (per-interface); I also set rule descriptions in the form of "[interface]: [action] [protocol] from [source] to [destination]" (e.g. "TRUST: Pass DNS from TRUST net to Public_v4"), so I can see exactly what rules are being hit in the logs. I can't say if it'll help characterize your issue, but the actual flow path through the firewall may surprise you. Or not. But what you see (or don't see) may be a data point. As for ARP, verify that the entries (MAC and interface) are as you expect, on the firewall and on your connected equipment. A basic element, but unless you're a super-sniffer, ARP issues can be hard to spot.
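
A minimal sketch of the ARP check described above, assuming the usual line format of FreeBSD `arp -an` output (the sample text below is made up, not captured from a real system). The helper names, addresses and the `igc0_vlan5` device name are hypothetical. It parses the text into IP -> (MAC, interface) and lists entries that differ from what you expect.

```python
import re

# Assumed line format of FreeBSD `arp -an` output, e.g.:
# ? (192.168.1.10) at 00:11:22:33:44:55 on igc0 expires in 1187 seconds [ethernet]
ARP_LINE = re.compile(r"\((?P<ip>[^)]+)\) at (?P<mac>[0-9a-fA-F:]{17}) on (?P<iface>\S+)")


def parse_arp(text):
    """Map IP -> (mac, interface) for every complete entry in the given text."""
    table = {}
    for line in text.splitlines():
        m = ARP_LINE.search(line)
        if m:  # incomplete entries have no MAC and are skipped
            table[m["ip"]] = (m["mac"].lower(), m["iface"])
    return table


def find_mismatches(expected, actual):
    """Return {ip: (expected_entry, actual_entry_or_None)} for entries that differ."""
    return {ip: (want, actual.get(ip)) for ip, want in expected.items() if actual.get(ip) != want}


# Made-up sample output and expectations, for illustration only.
sample = """? (192.168.1.10) at 00:11:22:33:44:55 on igc0 expires in 1187 seconds [ethernet]
? (192.168.5.20) at 66:77:88:99:aa:bb on igc0 expires in 900 seconds [ethernet]
? (192.168.1.30) at (incomplete) on igc0 expired [ethernet]"""

expected = {
    "192.168.1.10": ("00:11:22:33:44:55", "igc0"),
    "192.168.5.20": ("66:77:88:99:aa:bb", "igc0_vlan5"),
    "192.168.1.30": ("cc:dd:ee:ff:00:11", "igc0"),
}

for ip, (want, got) in find_mismatches(expected, parse_arp(sample)).items():
    print(ip, "expected", want, "got", got)
```

In this made-up sample, the second host shows up on the parent interface instead of the VLAN device and the third has no resolved MAC at all, which are the kinds of surprises worth looking for on both the firewall and the connected equipment.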

Good luck.

I have no deny rule on the LAN interface. There are only the 2 standard "allow all" rules (IPv4, IPv6).
And why did it work for X years with these rules, and only stopped working with the update? Others also have similar problems.

Quote from: vigeland on January 04, 2025, 05:42:08 PM
I have no deny rule on the LAN interface. There are only the 2 standard "allow all" rules (IPv4, IPv6).
And why did it work for X years with these rules, and only stopped working with the update? Others also have similar problems.

I've wondered for some time about OPNsense and whether it is reliable to work with in all environments.

I'm using it primarily as a VM firewall. In that role it seems to be 'not great but works'.

One culprit is now using an OPNsense VM with a hypervisor bridge which has a physical interface with multiple VLANs assigned.

The hypervisor sets the PVID egress as untagged on the bridge and tagged for the VLANs, which is how it is. However, the tagged VLANs are visible as untagged inside the OPNsense VM. And that's that. There is no documentation pointing out what to do or not to do.

When using multiple VLAN IDs on a single bridge, the only solution seems to be to create a bridge per VLAN, which doesn't really make sense, but works.


Quote from: JL on January 06, 2025, 09:41:50 PM
I've wondered for some time about OPNsense and whether it is reliable to work with in all environments.


Quote from: vigeland on January 04, 2025, 05:42:08 PM
I have no deny rule on the LAN interface. There are only the 2 standard "allow all" rules (IPv4, IPv6).
And why did it work for X years with these rules, and only stopped working with the update? Others also have similar problems.

Ermm. https://docs.opnsense.org/manual/how-tos/vlan_and_lagg.html
The manual for VLANs takes you to this URL, which explains not to mix tagged and untagged traffic.
From this thread, what I gather is that we have users who have not heeded the advice and have mixed traffic anyway. So far it has "worked", but not after an update. What might have happened is that the update has merely exposed the misconfiguration.
P.S. I was one of those users for some time, and it was not giving me any trouble either, until I realised that one day (like with this update, if that is what it is) it was bound to catch me. So I fixed it by reconfiguring to the recommended setup.

So the ideal course of action is to correct any remaining installations with mixed tagged and untagged traffic on an interface used with OPNsense, and only then move on to diagnosing any other problems and judging whether OPNsense is "reliable to work with in all environments."

For info:
In my particular case, it was not due to the tagged/untagged mix. The problem persisted even though I only use tagged interfaces. During the update something was changed in WireGuard; without WireGuard everything works. If I switch back to tagged and untagged, everything also works, just without WireGuard.
I suspect that WireGuard handles the route differently than before.