[update] after 22.7.9 update the gateway suddenly dies after 1 day or so

Started by manilx, December 03, 2022, 11:19:45 PM

Yes, I downgraded to Suricata 6.0.8 (keeping the rest of the system on 22.7.9) and it's been fine.

Same problem here. Here is what I found:
1) It is clearly IPS mode: when enabled, it drops all outbound traffic on WAN, while inbound traffic still passes;
2) If IPS mode is disabled, everything works as expected.

So Suricata 6.0.9 in IPS mode starts killing all traffic from inside to outside.

Some 22.7.9 boxes here, no issues so far.
But these boxes (and most of our 22.10 ones) are very powerful (8-core Xeons and so on).
One "slow" DC690 caused problems because Suricata misbehaved.
So I reverted to 22.7.8, let Suricata finish all its tasks (takes some time), disabled Suricata (waited again), and then did the upgrade.

Quote from: chknpikr on December 05, 2022, 01:01:40 AM
Can confirm it's Suricata 6.0.9.  Have spent many hours the last two days testing numerous settings and scenarios.

Reverted to Suricata 6.0.8 on OPNsense 22.7.9 and the problem stopped. The logs did not show anything other than this each time it happened: "/usr/local/etc/rc.linkup: DEVD: Ethernet detached event for dynamic wan(em0)". The problem was easily reproduced with nzbget, which will saturate the download pipeline. It seems to be related to multiple parallel high-bandwidth connections occurring simultaneously; I saw no unusual problems during normal daily network activity, so I'm sure most users will not notice anything amiss. It dropped the entire network within seconds.

Protectli box, intel NIC, i5, 16GB dual channel.  Suricata running IPS, Promiscuous, on LAN.  Platform and config have been rock solid until this upgrade.

And, there are no hardware problems with the NIC, cable, ISP modem or switch.
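For anyone trying to correlate the drops with the DEVD message quoted above, here is a minimal sketch that counts "Ethernet detached" events per interface. It reads log text on stdin, so it makes no assumption about where the system log lives. The function name `count_detach_events` is made up for illustration, and the parsing assumes the message ends with the interface name in parentheses, as in the line quoted above.

```shell
# Count "Ethernet detached" events per interface from log text on stdin.
# Assumes each message ends with the device name in parentheses, e.g. "wan(em0)".
count_detach_events() {
    grep -F 'Ethernet detached event for' \
        | sed -n 's/.*(\([^)]*\)).*/\1/p' \
        | sort | uniq -c \
        | awk '{print $2 ": " $1}'
}
```

Usage would be something like `count_detach_events < /path/to/system.log`, where the log path is a placeholder to adapt to your install.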
Thx for chiming in and confirming!!!

After more than 24 hours I can confirm that this workaround (not updating Suricata) fixed the issue for me as well.

I'm confused: I don't use Suricata and it's disabled, so how would this affect me?

To get a bit of straight-forward info into this thread:

# opnsense-revert -r 22.7.8 suricata

Suricata 6.0.9 introduced a backport set from version 7, but deviated from the way netmap(4) works on both version 6 and 7. It's supposed to offer a different netmap(4) mode but I'm not sure how well it was carried out or how the build decides which one to use.

Suffice to say I wasn't a fan of the unnecessary complications in the stable version (as version 7 isn't out yet) and I voiced my concern to the Suricata team in person when we met at Suricon last month.

We might be able to pull the patch from our 6.0.9 release, but in any case the outlook for version 7 is still indicating that a bug exists, possibly in FreeBSD source code, that locks up the packet flow at some undefined point in time.


Cheers,
Franco


Can someone with OpenSSL try the following? A modified 6.0.9 without the netmap API patch present:

# opnsense-revert -z suricata


Cheers,
Franco

Doesn't kill the entire network anymore (or crash the OPNsense router), but still kills the SSL connections after 30 seconds or so. It is throwing the Ethernet detached event on LAN (em1) now, but the interface recovers.



Much appreciated. Mine still seems to be running fine here and I've also been hammering speed tests. :/

I want to know which NIC driver(s) this is reproducible with. Can you post the output of:

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml


Thanks,
Franco

Thanks all, I had the same issue and disabled IPS for now.

  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: vtnet0
    copy-iface: vtnet0^
  - interface: vtnet0^
    copy-iface: vtnet0
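The grep output above mixes commented-out defaults from other sections with the active netmap entries. As a rough sketch, the following awk filter prints only the uncommented `interface` / `copy-iface` pairs under the top-level `netmap:` key. The function name `netmap_pairs` is hypothetical, and the parsing assumes the simple indentation layout shown above rather than full YAML.

```shell
# Print "interface -> copy-iface" pairs from the netmap section of
# suricata.yaml (or of the grep output above), read from stdin.
netmap_pairs() {
    awk '
        # Skip commented-out lines.
        /^[[:space:]]*#/ { next }
        # An unindented key starts a new top-level section.
        /^[^[:space:]#-]/ { in_netmap = ($1 == "netmap:"); next }
        # Remember the interface of the current list entry.
        in_netmap && $1 == "-" && $2 == "interface:" { iface = $3; next }
        # Emit the pair once its copy-iface line is seen.
        in_netmap && $1 == "copy-iface:" { print iface " -> " $2 }
    '
}
```

It can be fed the file directly, for example `netmap_pairs < /usr/local/etc/suricata/suricata.yaml`. Entries without an active `copy-iface` (such as `default` above) are not printed.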


Running inside a Proxmox VM with virtio.