vlan issues - in combination with IPS (IDS works)

Started by fireburner, January 28, 2022, 10:47:48 PM

January 28, 2022, 10:47:48 PM Last Edit: January 31, 2022, 06:58:16 PM by fireburner
First of all, I would like to thank everyone involved in any way for this wonderful firewall appliance!

I just upgraded to 22.1.
After booting, everything works normally for around 3 minutes.
Then I see these messages on the console:
099.460195 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
igb1: link state changed to DOWN
igb1_vlan40: link state changed to DOWN
igb1_vlan11: link state changed to DOWN
igb1_vlan30: link state changed to DOWN
100.203318 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
100.510787 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
100.830802 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
igb3: link state changed to DOWN
igb3_vlan1: link state changed to DOWN
igb3_vlan40: link state changed to DOWN
igb3_vlan30: link state changed to DOWN
101.016024 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
101.349329 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
igb1: link state changed to UP
igb1_vlan40: link state changed to UP
igb1_vlan11: link state changed to UP
igb1_vlan30: link state changed to UP
igb3: link state changed to UP
igb3_vlan1: link state changed to UP
igb3_vlan40: link state changed to UP
igb3_vlan30: link state changed to UP

Afterwards, no network connection is possible via the VLANs anymore, and DHCP doesn't work either.
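
To make the console output above easier to read, the link state lines can be grouped by parent interface. This is only a rough sketch: the function name `link_transitions` is made up for illustration, and it assumes the `igbX_vlanN` naming seen in the messages above.

```python
import re
from collections import defaultdict

# Matches kernel console lines such as "igb1_vlan40: link state changed to DOWN".
LINK_RE = re.compile(r"^(?P<ifname>\S+): link state changed to (?P<state>UP|DOWN)$")


def link_transitions(console_text):
    """Return {parent: [(ifname, state), ...]} in the order the lines appear."""
    events = defaultdict(list)
    for line in console_text.splitlines():
        match = LINK_RE.match(line.strip())
        if not match:
            continue  # skip iflib_netmap_config and other unrelated lines
        ifname = match.group("ifname")
        # Assumption: VLAN interfaces are named <parent>_vlan<id>, as in this thread.
        parent = ifname.split("_vlan")[0]
        events[parent].append((ifname, match.group("state")))
    return dict(events)


sample = """\
igb1: link state changed to DOWN
igb1_vlan40: link state changed to DOWN
100.203318 [ 849] iflib_netmap_config       txr 2 rxr 2 txd 1024 rxd 1024 rbufsz 2048
igb1: link state changed to UP
igb1_vlan40: link state changed to UP
"""

for parent, changes in link_transitions(sample).items():
    print(parent, changes)
```

Grouping by parent makes it visible that each VLAN follows its parent down and up, which is what the messages above suggest.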

Here is a little description of my systems and network:
OPNsense runs on a pcengines apu4d4:
igb0 is the wan port -> all is fine here
igb2 is bridged with the LAN interface and I get DHCP and network access to the OPNsense box and the internet as expected.
The parent interfaces igb1 and igb3 are not used themselves.
Instead, 3 VLANs for 3 subnets (LAN, GUEST, DMZ) are used.
vlan1 and vlan11 are bridged to the LAN interface; vlan 40 is bridged into GUEST and vlan 30 into DMZ. All 3 subnets have DHCP.

A VLAN-capable switch is connected to both igb1 and igb3 and tags the VLANs to different ports on the switch, so I can connect devices and have them in the respective subnet where I want them to be.
All was working fine for at least 2 years with this setup.

What is also interesting is that Interfaces -> Overview shows (in addition to igb1 and igb3) the following unassigned interfaces:
lo0, enc0, pfsync0, pflog0
and ovpns2, ovpns3 (I only have one OpenVPN server running, yet ovpns3 shows up)

I also have "Hardware CRC", "Hardware TSO" and "Hardware LRO" disabled.
I can also see that there are some tunables which no longer seem to be supported:
debug.pfftpproxy Disable the pf ftp proxy handler. unsupported unknown
dev.igb.0.eee_disabled Disable Energy Efficiency unsupported 1
dev.igb.1.eee_disabled Disable Energy Efficiency unsupported 1
dev.igb.2.eee_disabled Disable Energy Efficiency unsupported 1
dev.igb.3.eee_disabled Disable Energy Efficiency unsupported 1
hint.acpi_perf.0.disabled [fireburner] tuning CPU boost unsupported 0
hint.acpi_throttle.0.disabled [fireburner] tuning CPU boost unsupported 0
hint.p4tcc.0.disabled [fireburner] tuning CPU boost unsupported 0
hw.igb.0.fc disable flow control unsupported 0
hw.igb.1.fc disable flow control unsupported 0
hw.igb.2.fc disable flow control unsupported 0
hw.igb.3.fc disable flow control unsupported 0
hw.igb.num_queues Set number of queues to number of cores divided by number of ports, 0 lets FreeBSD decide (should be default) unsupported 0
hw.igb.rx_process_limit [fireburner] tuning net bandw unsupported -1
hw.igb.tx_process_limit [fireburner] tuning net bandw unsupported -1
legal.intel_igb.license_ack [fireburner] tuning net bandw unsupported 1


Does anyone have an idea which changes might have caused the issues here?

edit: I have removed all the unsupported tunables and restarted, but the issue remained.
Here is the full general log from the GUI:
2022-01-28T23:02:16 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 3_VLAN_DMZ[opt4]
2022-01-28T23:02:16 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 3_VLAN_DMZ[opt4]) (real interface: igb3_vlan30).
2022-01-28T23:02:16 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb3_vlan30'
2022-01-28T23:02:16 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt4(igb3_vlan30)
2022-01-28T23:02:15 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 3_VLAN_GUEST[opt1]
2022-01-28T23:02:15 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 3_VLAN_GUEST[opt1]) (real interface: igb3_vlan40).
2022-01-28T23:02:15 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb3_vlan40'
2022-01-28T23:02:14 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt1(igb3_vlan40)
2022-01-28T23:02:14 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 3_VLAN_LAN[opt2]
2022-01-28T23:02:14 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 3_VLAN_LAN[opt2]) (real interface: igb3_vlan1).
2022-01-28T23:02:14 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb3_vlan1'
2022-01-28T23:02:13 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt2(igb3_vlan1)
2022-01-28T23:02:11 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 1_VLAN_DMZ[opt8]
2022-01-28T23:02:11 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 1_VLAN_DMZ[opt8]) (real interface: igb1_vlan30).
2022-01-28T23:02:11 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb1_vlan30'
2022-01-28T23:02:11 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt8(igb1_vlan30)
2022-01-28T23:02:10 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 1_VLAN_LAN[opt7]
2022-01-28T23:02:10 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 1_VLAN_LAN[opt7]) (real interface: igb1_vlan11).
2022-01-28T23:02:10 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb1_vlan11'
2022-01-28T23:02:10 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt7(igb1_vlan11)
2022-01-28T23:02:09 Error opnsense /usr/local/etc/rc.newwanip: Failed to detect IP for 1_VLAN_GUEST[opt9]
2022-01-28T23:02:09 Error opnsense /usr/local/etc/rc.newwanip: On (IP address: ) (interface: 1_VLAN_GUEST[opt9]) (real interface: igb1_vlan40).
2022-01-28T23:02:09 Error opnsense /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb1_vlan40'
2022-01-28T23:02:08 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt9(igb1_vlan40)
2022-01-28T23:02:07 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt4(igb3_vlan30)
2022-01-28T23:02:06 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt1(igb3_vlan40)
2022-01-28T23:02:05 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt2(igb3_vlan1)
2022-01-28T23:02:02 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt8(igb1_vlan30)
2022-01-28T23:02:01 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt7(igb1_vlan11)
2022-01-28T23:02:00 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt9(igb1_vlan40)
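
The detached/attached pairs in the log above can be turned into a per-interface outage duration. The following is a sketch only: `outage_seconds` is a hypothetical helper, and the regular expression assumes the exact line layout pasted above (timestamp, severity, host, script path).

```python
import re
from datetime import datetime

# Assumed layout, taken from the pasted log:
# <timestamp> <severity> <host> /usr/local/etc/rc.linkup: DEVD: Ethernet <kind> event for static <opt>(<ifname>)
EVENT_RE = re.compile(
    r"^(?P<ts>\S+) \S+ \S+ /usr/local/etc/rc\.linkup: DEVD: Ethernet "
    r"(?P<kind>attached|detached) event for static \w+\((?P<ifname>[^)]+)\)$"
)


def outage_seconds(log_text):
    """Map each interface to the seconds between its detach and the next attach."""
    events = []
    # The GUI log is newest-first, so walk it in reverse to get chronological order.
    for line in reversed(log_text.splitlines()):
        match = EVENT_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
            events.append((ts, match.group("kind"), match.group("ifname")))
    events.sort(key=lambda event: event[0])  # stable sort, keeps order within a second

    detached_at = {}
    outages = {}
    for ts, kind, ifname in events:
        if kind == "detached":
            detached_at[ifname] = ts
        elif ifname in detached_at:
            outages[ifname] = (ts - detached_at.pop(ifname)).total_seconds()
    return outages


sample_log = """\
2022-01-28T23:02:16 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt4(igb3_vlan30)
2022-01-28T23:02:08 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for static opt9(igb1_vlan40)
2022-01-28T23:02:07 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt4(igb3_vlan30)
2022-01-28T23:02:00 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for static opt9(igb1_vlan40)
"""

print(outage_seconds(sample_log))
```

For the four sample lines this gives 8 seconds for igb1_vlan40 and 9 seconds for igb3_vlan30, so each VLAN is away for several seconds per flap.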


edit2: While checking the boot messages, I found these suspicious messages:
igb0: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb0 (ifp 0xfffff800035c8800), ignoring.
igb1: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb1 (ifp 0xfffff800034b8000), ignoring.
igb2: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb2 (ifp 0xfffff800037fe000), ignoring.
igb3: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb3 (ifp 0xfffff80003609800), ignoring.


edit3: I had IDS and IPS turned on. After disabling IDS completely, I can see on the console that the VLANs go down and up again. Then everything works normally again. I will investigate further what is wrong with IDS/IPS, whether only one of them is affected, or whether the blocklist needs to be reapplied.

Hi,
same problem here.
After the upgrade, there is no connection to the firewall over trunk interfaces.
After disabling IPS/IDS, everything is working as expected.

Can I provide some information to help?


Same issue on a fresh install of 22.1. Everything runs fine with intrusion detection turned off. As soon as I enable it, all traffic to the WAN dies.

I finally found time to look into this:

I cleared all lists and started in IDS only mode.
All is working fine, even after 30 minutes.

However, as soon as I enable IPS, the described issue is present again.

I assume it could have something to do with the "Promiscuous mode" setting that needs to be enabled for IPS.
However, promiscuous mode does not seem to cause any issues when it is active with IDS only.

Is IDS/IPS active on the physical interfaces only, or on the VLANs as well?

Activate IPS only on physical interfaces.

IPS is only enabled on the physical interface in my setup.

I ran into the same problem. Took me several reboots, factory resets and turning off services to finally figure out what it was.

I have two WAN interfaces and the parent interface for my VLANs. This worked well before, but broke with 22.1. I will leave IPS/IDS disabled for now.

Same issue here after the update to 22.1.
Cheers,
Crissi

February 05, 2022, 09:36:01 PM #8 Last Edit: February 05, 2022, 10:40:28 PM by allan
I first ran into this issue under 21.7.6. Netmap API version 14 was reverted in 21.7.7, which fixed my issue until the upgrade to 22.1. Since then, I have removed all tunables except for the Spectre (hw.ibrs_disable) and Meltdown (vm.pmap.pti) mitigations. I no longer have tunables listed as "unsupported" in the GUI. This did not fix the issue, so I have disabled Suricata again.

These messages came up on the console and may be related.

iflib_netmap_config        txr 2 rxr txd 1024 rxd 1024 rbufsz 2048
igb1: permanently promiscuous mode enabled
iflib_netmap_config        txr 2 rxr txd 1024 rxd 1024 rbufsz 2048
igb1: link state changed to DOWN
iflib_netmap_config        txr 2 rxr txd 1024 rxd 1024 rbufsz 2048
iflib_netmap_config        txr 2 rxr txd 1024 rxd 1024 rbufsz 2048
igb2: link state changed to DOWN
iflib_netmap_config        txr 2 rxr txd 1024 rxd 1024 rbufsz 2048
igb2_vlan2: link state changed to DOWN
igb2_vlan3: link state changed to DOWN


Edit: Disregard my comment about iflib_netmap_config. I forgot that is normal.

I have the same issue. I have disabled IDS/IPS altogether. Nothing I change in the GUI fixes the issue. They have broken the Suricata package with the new update. I had a similar issue with my 10 Gbit fiber interfaces when they released Netmap version 14. The new update causes the 10 Gbit port on my Cisco switch to flap, so I have increased the flap time and set up recovery in case the port gets disabled. I am not sure why the port has to go down and up multiple times, which makes the Cisco switch think the port is flapping and disable it.

I went from a very stable firewall to a very unstable one with this new update. :(

Should we open a GitHub issue for this? If yes, in which repo?

Don't run IPS in VLAN interfaces. Emulated mode didn't work prior to 22.1 in kernel and now it does but it's buggy as hell as per upstream.


Cheers,
Franco

Quote from: franco on February 07, 2022, 12:31:29 PM
Don't run IPS in VLAN interfaces. Emulated mode didn't work prior to 22.1 in kernel and now it does but it's buggy as hell as per upstream.
Thanks Franco, but I only activated IPS on physical interfaces, not on the VLANs.
Or does it mean that we must not activate it on VLAN parent interfaces either?

However, I am also wondering whether IDS/IPS works at all this way (sitting on the parent interface).
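
To make the distinction in this exchange concrete (IPS on a VLAN interface versus IPS on a physical interface that carries VLANs), here is a small sketch that sorts an IPS interface selection into those cases. The helper name `classify_ips_interfaces` and the example IPS selection are hypothetical, and the check relies only on the `igbX_vlanN` naming used in this thread. It does not answer whether IPS on a parent interface is affected by the bug; it only flags which case applies.

```python
def classify_ips_interfaces(ips_ifaces, all_ifaces):
    """Label each IPS-enabled interface as 'vlan', 'vlan-parent' or 'plain'."""
    result = {}
    for iface in ips_ifaces:
        if "_vlan" in iface:
            # IPS directly on a VLAN interface: the case Franco warns about.
            result[iface] = "vlan"
        elif any(other.startswith(iface + "_vlan") for other in all_ifaces):
            # Physical interface that has VLANs on top: the open question here.
            result[iface] = "vlan-parent"
        else:
            result[iface] = "plain"
    return result


# Interface names as they appear in the first post.
all_ifaces = [
    "igb0", "igb1", "igb1_vlan11", "igb1_vlan30", "igb1_vlan40",
    "igb2", "igb3", "igb3_vlan1", "igb3_vlan30", "igb3_vlan40",
]

# Hypothetical IPS selection, not taken from any poster's configuration.
print(classify_ips_interfaces(["igb0", "igb1", "igb3"], all_ifaces))
```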

So can anyone here confirm that it works when using the parent interface with promisc mode enabled? A simple "works" or "still doesn't work" would be fine :)

Quote from: mimugmail on February 08, 2022, 06:31:34 AM
So can anyone here confirm that it works when using the parent interface with promisc mode enabled? A simple "works" or "still doesn't work" would be fine :)

In 21.7.6, running IPS on the parent physical interface(s) did not work for me. I need VLANs on top on both sides (WAN and LAN), but IPS with promiscuous mode is enabled on the parent WAN interface only. As I assume this is related to the issues we saw in 21.7.6, I will wait a while before updating to 22.1.