[update] after 22.7.9 update the gateway suddenly dies after 1 day or so

Started by manilx, December 03, 2022, 11:19:45 PM

  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: em1
    copy-iface: em1^
  - interface: em1^
    copy-iface: em1


Baremetal i5, 16GB dual channel RAM, Intel 82583V NIC, Protectli box.  Suricata IPS, Promiscuous, on LAN.

I believe the issue manifests itself most easily with multiple TLS connections.  As mentioned previously, I can reproduce the problem instantly with nzbget (Docker instance on a NAS): set up 6 news servers with 63 TLS connections on a 1Gbps connection and watch your interface get obliterated; this setup can completely saturate the line with TLS traffic.  Otherwise my network performs flawlessly with 6.0.9, and the same setup works 100% flawlessly on 6.0.8.  I can switch versions (6.0.9 patched without the new API; the previous version bricked the box) mid-download and watch the download speeds die and come back (restarting the Suricata service in between).

This is very much what's going on:

https://redmine.openinfosecfoundation.org/issues/5744#note-16

I came across this previous problem as well during my research, and it describes exactly what is happening with my system.  There are no errors logged.  The interface just drops ("Ethernet detached event").  And the first version of 6.0.9 completely bricked my box, which could not be accessed through ssh or the web GUI.  A full reboot with the hardware power button was the only option.

I updated to the hotfix 4 hours ago (no reboot). Now the internet suddenly died. The gateway was OK.
Services were all up.
I restarted suricata but it didn't help.

Didn't have time to check logs etc because I needed internet urgently. As such I just rebooted and again all OK after that.

Reverted to previous 22.7.9 with suricata 6.0.8_1 which has been stable. I had remote backups failing from our company and can't be in this "test" state. Thought that hotfix was "stable".

Quote from: franco on December 07, 2022, 08:40:09 AM
Much appreciated. Mine still seems to be running fine here and I've also been hammering speed tests. :/

I want to know which NIC driver(s) this is reproducible with. Can you post the output of:

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml


Thanks,
Franco
/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: xn1
    copy-iface: xn1^
  - interface: xn1^
    copy-iface: xn1

Running as a Xen HVM guest on Linux 5.15.80

We want to make sure that we get all the data so I'm extending the previous a bit:

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
# dmesg | grep generic_netmap_register


Thanks,
Franco
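
For anyone who wants to collect both outputs in one go, here is a minimal sketch that wraps the two commands above in shell functions. The function names are hypothetical and only for illustration; the config path is the one from the commands above and can be overridden with an argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPNsense): print the interface-related
# lines from a Suricata config. Defaults to the path used in this thread.
suricata_netmap_lines() {
    conf="${1:-/usr/local/etc/suricata/suricata.yaml}"
    grep -e '-.interface:' -e 'copy-iface:' -e 'netmap:' "$conf"
}

# Hypothetical helper: print netmap registration messages from the kernel
# message buffer, or a marker line when there are none.
netmap_register_lines() {
    dmesg | grep generic_netmap_register || echo "(no output)"
}

# Usage on the firewall:
#   suricata_netmap_lines
#   netmap_register_lines
```

Printing "(no output)" explicitly makes it obvious in a forum post that the dmesg check was run and found nothing.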

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: vtnet0
    copy-iface: vtnet0^
  - interface: vtnet0^
    copy-iface: vtnet0

# dmesg | grep generic_netmap_register
-> no output

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: xn1
    copy-iface: xn1^
  - interface: xn1^
    copy-iface: xn1
# dmesg | grep generic_netmap_register
310.301436 [ 320] generic_netmap_register   Emulated adapter for wg0 activated
310.302155 [ 320] generic_netmap_register   Emulated adapter for xn0 activated
476.574947 [ 320] generic_netmap_register   Emulated adapter for xn1 activated
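
As far as I understand it, an "Emulated adapter" line means netmap attached to that interface in its generic (emulated) mode rather than through native driver support. A small sketch for pulling just the interface names out of such output; the function name is hypothetical, and it assumes the message format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg-style lines on stdin and print the names
# of interfaces for which netmap reported an emulated adapter.
# Assumes the format "... Emulated adapter for <iface> activated".
emulated_netmap_ifaces() {
    awk '/generic_netmap_register/ && /Emulated adapter for/ { print $(NF-1) }' | sort -u
}

# Usage on the firewall:
#   dmesg | emulated_netmap_ifaces
```

For the three lines above this prints wg0, xn0 and xn1, one per line.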


# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: igb0
    copy-iface: igb0^
  - interface: igb0^
    copy-iface: igb0

# dmesg | grep generic_netmap_register

<EMPTY>

I updated to 22.7.9_3 from 22.7.8. Unbound will not start now. Tried rebooting multiple times. Everything looks like it is running, but I know I will hit a wall because Unbound no longer runs.


Most pages are now inaccessible. Can't get any firmware pages to show up. Think I will try to go back to 22.7.8.
This update is a disaster.

Very strange. I disabled Suricata, rebooted, and after 5 minutes Unbound came back.
Do I need to revert to Suricata 6.0.8? I tried opnsense-revert suricata. It just hangs.


opnsense-revert -r 22.7.8 suricata runs for a couple of minutes, then fails with a timeout.

Then I tried changing the repository and all I get are dots. After 10 minutes it is on the third row of dots.

You can download it directly from here:

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/OpenSSL/All/

Scroll down and find:

   suricata-6.0.8_1.pkg   2022-11-16 21:31   1.9M   

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/OpenSSL/All/suricata-6.0.8_1.pkg

I'm assuming you're on OpenSSL.  Easy enough to figure out, if you're on LibreSSL.

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/LibreSSL/All/suricata-6.0.8_1.pkg
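
A rough sketch of how the manual download could look from the shell. The function name is hypothetical, and the fetch / pkg add / pkg lock steps in the comments are my assumption of a workable sequence on a FreeBSD-based box, not an official OPNsense procedure, so double-check before running them.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for suricata 6.0.8_1 from the
# 22.7.8 package set, for either the OpenSSL or LibreSSL flavour.
suricata_pkg_url() {
    flavour="${1:-OpenSSL}"
    case "$flavour" in
        OpenSSL|LibreSSL) ;;
        *) echo "unknown flavour: $flavour" >&2; return 1 ;;
    esac
    echo "https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/${flavour}/All/suricata-6.0.8_1.pkg"
}

# Assumed usage on the firewall (not run here):
#   url=$(suricata_pkg_url OpenSSL)
#   fetch -o /tmp/suricata-6.0.8_1.pkg "$url"
#   pkg add -f /tmp/suricata-6.0.8_1.pkg
#   pkg lock -y suricata    # optional: keep later updates from replacing it
```

If you lock the package, remember to unlock it again (pkg unlock suricata) once a fixed version is out.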