OPNsense Forum

Archive => 22.7 Legacy Series => Topic started by: manilx on December 03, 2022, 11:19:45 pm

Title: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 03, 2022, 11:19:45 pm
I have been running 22.7.8 (and all before) without issues for 9 months. Did a lot of version upgrades. Even from 22.6 to 22.7

Now with the last update I did on Friday I find that after 24hrs or so the internet dies and I see the gateway status "offline". First I thought it was the ISP router or cables. Lost 1 hour checking all. BUT the issue was OPNsense because after a reboot all OK again.
Now again after 24hrs out of the blue the same happened again.

There seems to be an issue with 22.7.9 which I'm at a loss to explain or check.

I have reverted to 22.7.8 (running on proxmox) and did a fresh upgrade again.
Let's see.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: slackadelic on December 03, 2022, 11:33:45 pm
FYI, there's an older post that this is being discussed in already here: https://forum.opnsense.org/index.php?topic=31322.0
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 04, 2022, 11:47:41 am
Hi

I don't think this is a problem with Unbound.

The Gateway turned "red" and my router couldn't be pinged.

I also rebooted after the upgrade. And also after the problem appeared the 1st time and it reappeared.

Must be something different!
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 04, 2022, 12:11:28 pm
https://www.reddit.com/r/opnsense/comments/zbt3il/after_2279_update_the_gateway_suddenly_dies_after/

Posted on Reddit too, and there seems to be an issue with the last update!

Suricata may be to blame. I didn't check if it was running when the gateway was lost....
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: peltz on December 04, 2022, 12:53:23 pm
I had the same issue here, firewall (Shuttle DH270) suddenly unreachable after update to 22.7.9 after ~1 hour.
Not pingable on either interface, but could be rebooted with power button (no monitor, keyboard attached)
After reboot, everything worked normally, but froze again after ~30min

Reverted the kernel -> no success
Reverted base, opnsense and suricata -> stable for 24+ hours

Strongly suspect suricata, but there is nothing to be found in the logs
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Wirrkopf on December 04, 2022, 02:24:14 pm
It has to do with Suricata.

As soon as I put a lot of load on my opnsense 22.7.9 box, the interface which I use starts to stop responding to pings, etc. I have another interface on my opnsense box and that is still working. When I restart the suricata service, the ping replies start working again.

I have tested this by running a few speedtest-cli calls in parallel, and that brings the relevant interface to a halt.
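
For anyone who wants to repeat this test, here is a minimal sketch of the parallel load. It assumes the tool meant above is `speedtest-cli` and that it is installed on a LAN client; `parallel_load` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: start N copies of a command in the background
# and block until all of them have finished.
parallel_load() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &          # launch one copy in the background
        i=$((i + 1))
    done
    wait                # wait for every background copy to exit
}

# Example, run on a LAN client (assumes speedtest-cli is installed):
#   parallel_load 4 speedtest-cli
```

Pinging the firewall's interface address from a second terminal while the load runs should show whether the replies stop.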
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 04, 2022, 05:41:22 pm
This problem is definitely related to Suricata after Opnsense 22.7.9 upgrade.  I can freeze my Opnsense box (all interfaces drop offline, web GUI freezes and requires hardware reboot with power button), immediately after saturating the line (1Gbps lines) with nzbget (TLS/SSL).  No issues with Suricata service stopped.  Suricata is configured in promiscuous mode, ips enabled, monitoring LAN interface.  This configuration has worked flawlessly for at least a year, previous to this upgrade.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 04, 2022, 06:21:16 pm
I have now locked Suricata at v6.0.8 and upgraded as suggested in https://www.reddit.com/r/opnsense/comments/zbt3il/after_2279_update_the_gateway_suddenly_dies_after/

Will report if it keeps stable.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 04, 2022, 10:40:02 pm
Our issues are the same. I'm running Suricata as well. For some reason on mine, Unbound is the first victim, so that's what I was focused on.
I ran a speedtest (I have 250/250) and got 250 down, up 85, and it actually quit before it finished the upload test. I restarted suricata and unbound and everything is working again.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 04, 2022, 11:28:08 pm
Hope @franco or someone is reading this and already fixing......
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Patrick M. Hausen on December 04, 2022, 11:33:58 pm
Shouldn't that be up to the Suricata folks to fix? And best reported in the appropriate subforum?
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 04, 2022, 11:49:52 pm
As I don't know if this is the problem, simple user here, I hope someone with more experience will take this up....
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: cookiemonster on December 05, 2022, 12:03:30 am
Can you try digging in the logs for clues? If there is indeed a problem with Suricata, which this thread has no proof of yet, then as pmhausen wrote, it is not for the OPN devs to try to reproduce your setup by guessing.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 05, 2022, 12:53:16 am
Looks like on reddit the problem was solved by not updating suricata or by having to restart it

I have upgraded again but blocked Suricata from being updated. Waiting to see whether the issue appears again. If it does, I can look at the logs (which ones?).
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 05, 2022, 01:01:40 am
Can confirm it's Suricata 6.0.9.  Have spent many hours the last two days testing numerous settings and scenarios.

Reverted to Suricata 6.0.8 on Opnsense 22.7.9 and the problem stopped.  The logs did not show anything other than this: "/usr/local/etc/rc.linkup: DEVD: Ethernet detached event for dynamic wan(em0)" Each time it happened.  Problem was easily reproduced with nzbget (will saturate download pipeline; seems to be related to multiple, parallel high bandwidth connections occurring simultaneously; saw no unusual problems during normal daily network activity, so I'm sure most users will not notice anything amiss).  Dropped the entire network within seconds.

Protectli box, intel NIC, i5, 16GB dual channel.  Suricata running IPS, Promiscuous, on LAN.  Platform and config have been rock solid until this upgrade.

And, there are no hardware problems with the NIC, cable, ISP modem or switch.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 05, 2022, 05:07:34 am
Yes, I downgraded to Suricata 6.0.8 (keeping the rest of the system on 22.7.9) and it's been fine.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: armdn on December 05, 2022, 08:55:59 am
Same problem here. Here is what I found:
1) It is clearly IPS mode: when enabled, it drops all outbound traffic on WAN, while inbound traffic still passes;
2) If IPS mode is disabled, everything works as expected.

So Suricata 6.0.9 in IPS mode starts killing all traffic from inside to outside.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: guenti_r on December 05, 2022, 09:14:47 am
Some 22.7.9 Boxes, no issues so far.
But these boxes (and most of our 22.10) are very powerful (8 core Xeons and so on).
One "slow" DC690 caused problems because Suricata misbehaved.
So I reverted to 22.7.8, let Suricata finish all its tasks (takes some time), disabled Suricata (waited again) and then did the upgrade.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 05, 2022, 12:44:28 pm
Thx for chiming in and confirming!!!
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 05, 2022, 11:50:20 pm
After more than 24hrs I can confirm that this workaround (not updating suricata) fixed the issue for me also.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: slackadelic on December 06, 2022, 07:13:53 am
I'm confused, as I don't use suricata and it's disabled, how would this affect me even though I don't use it?
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 06, 2022, 08:39:49 am
To get a bit of straight-forward info into this thread:

# opnsense-revert -r 22.7.8 suricata

Suricata 6.0.9 introduced a backport set from version 7, but deviated from the way netmap(4) works on both version 6 and version 7. It's supposed to offer a different netmap(4) mode, but I'm not sure how well it was carried out or how the build decides which one to use.

Suffice to say I wasn't a fan of the unnecessary complications in the stable version (as version 7 isn't out yet) and I voiced my concern to the Suricata team in person when we met at Suricon last month.

We might be able to pull the patch from our 6.0.9 release, but in any case the outlook for version 7 is still indicating that a bug exists, possibly in FreeBSD source code, that locks up the packet flow at some undefined point in time.


Cheers,
Franco
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 06, 2022, 10:56:06 am
Reported here https://redmine.openinfosecfoundation.org/issues/5744
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 06, 2022, 03:05:20 pm
Can someone with OpenSSL try the following? A modified 6.0.9 without the netmap API patch present:

# opnsense-revert -z suricata


Cheers,
Franco
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 06, 2022, 04:27:46 pm
Doesn't kill the entire network anymore (or crash the Opnsense router), but still kills the SSL connections after 30 seconds or so.  Throwing the ethernet detached event on LAN (em1) now, but interface recovers.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 06, 2022, 04:28:26 pm
Much thanks Franco for the followup.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 07, 2022, 01:49:45 am
I put my .02 in on the bug report, Thanks.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 07, 2022, 08:40:09 am
Much appreciated. Mine still seems to be running fine here and I've also been hammering speed tests. :/

I want to know with which NIC driver(s) this is reproducible. Can you post the output of:

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml


Thanks,
Franco
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: XabiX on December 07, 2022, 11:02:20 am
Thanks all, I had the same issue and disabled IPS for now.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 07, 2022, 12:15:41 pm
Code: [Select]
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: vtnet0
    copy-iface: vtnet0^
  - interface: vtnet0^
    copy-iface: vtnet0

Running inside proxmox VM with virtio
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 07, 2022, 02:52:25 pm
Code: [Select]
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: em1
    copy-iface: em1^
  - interface: em1^
    copy-iface: em1

Baremetal i5, 16GB dual channel RAM, Intel 82583V NIC, Protectli box.  Suricata IPS, Promiscuous, on LAN.

I believe the issue manifests itself most easily with multiple TLS connections.  As mentioned previously, I can reproduce the problem instantly with nzbget (Docker instance on NAS): set up 6 news servers with 63 TLS connections on a 1Gbps connection and watch your interface get obliterated; this setup can completely saturate the line with TLS/SSL traffic.  My network performs flawlessly with 6.0.9 otherwise, and the same setup works 100% flawlessly on 6.0.8.  I can switch the versions (6.0.9 patched without the new API; the previous version bricked the box) mid-download and watch the download speeds die and come back (restarting the Suricata service in between).
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 07, 2022, 03:07:50 pm
This is very much what's going on:

https://redmine.openinfosecfoundation.org/issues/5744#note-16

I came across this previous problem, as well, during my research, and it describes exactly what is happening with my system.  There are no errors logged.  Interface just drops.  "Ethernet detached event".  And, with the first version of 6.0.9, completely bricked my box and could not be accessed through ssh or web GUI.  Full reboot with hardware power button was only option.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 07, 2022, 09:58:42 pm
I updated 4hrs ago to hotfix (no reboot). Now suddenly internet dies. Gateway was OK.
Services were all up.
I restarted suricata but didn't help.

Didn't have time to check logs etc because I needed internet urgently. As such I just rebooted and again all OK after that.

Reverted to previous 22.7.9 with suricata 6.0.8_1 which has been stable. I had remote backups failing from our company and can't be in this "test" state. Thought that hotfix was "stable".
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 07, 2022, 11:08:35 pm
Code: [Select]
/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: xn1
    copy-iface: xn1^
  - interface: xn1^
    copy-iface: xn1
Running as a Xen HVM guest on Linux 5.15.80
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 09, 2022, 10:12:13 am
We want to make sure that we get all the data so I'm extending the previous a bit:

# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
# dmesg | grep generic_netmap_register


Thanks,
Franco
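
The two commands above can be wrapped so the output is easy to paste into a post. This is only a sketch: `collect_netmap_info` is a hypothetical helper name, and the default path is the one used in the grep command above. A `generic_netmap_register ... Emulated adapter` line in dmesg appears to indicate that netmap is running in emulated mode for that interface rather than with native driver support.

```shell
#!/bin/sh
# Hypothetical helper: print the Suricata interface/netmap settings and
# any emulated netmap adapters reported by the kernel.
collect_netmap_info() {
    # Path defaults to the one used in this thread; pass another as $1.
    conf=${1:-/usr/local/etc/suricata/suricata.yaml}

    echo "== suricata.yaml interface settings =="
    grep -e '-.interface:' -e 'copy-iface:' -e 'netmap:' "$conf"

    echo "== emulated netmap adapters (dmesg) =="
    # Print a marker instead of nothing when dmesg has no matching line.
    dmesg 2>/dev/null | grep generic_netmap_register || echo "(no output)"
}

# Example: collect_netmap_info
```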
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 09, 2022, 10:19:27 am
Code: [Select]
# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: vtnet0
    copy-iface: vtnet0^
  - interface: vtnet0^
    copy-iface: vtnet0

# dmesg | grep generic_netmap_register
-> no output
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 10, 2022, 01:24:41 am
Code: [Select]
# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: xn1
    copy-iface: xn1^
  - interface: xn1^
    copy-iface: xn1
# dmesg | grep generic_netmap_register
310.301436 [ 320] generic_netmap_register   Emulated adapter for wg0 activated
310.302155 [ 320] generic_netmap_register   Emulated adapter for xn0 activated
476.574947 [ 320] generic_netmap_register   Emulated adapter for xn1 activated
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: peltz on December 10, 2022, 10:52:44 am
Code: [Select]
# grep -e -.interface: -e copy-iface: -e netmap: /usr/local/etc/suricata/suricata.yaml
  - interface: eth0
    #copy-iface: eth1
  - interface: default
  - interface: default
netmap:
  - interface: default
    # (e.g. "copy-iface: eth0+"). Don't forget to set up a symmetrical eth0+ -> eth0
    #copy-iface: eth3
  - interface: igb0
    copy-iface: igb0^
  - interface: igb0^
    copy-iface: igb0

# dmesg | grep generic_netmap_register

<EMPTY>
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 10, 2022, 10:19:07 pm
I updated to 22.7.9_3 from 22.7.8. Unbound will not start now. Tried rebooting multiple times. Everything looks like it is running, but I know I will hit a wall because Unbound no longer runs.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 10, 2022, 10:22:39 pm
No output on dmesg.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 10, 2022, 10:36:10 pm
Most pages are now inaccessible. Can't get any firmware pages to show up. Think I will try to go back to 22.7.8.
This update is a disaster.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 10, 2022, 10:57:40 pm
Very strange, I disabled Suricata, rebooted and after 5 minutes Unbound came back.
Do I need to revert to Suricata 6.0.8? I tried opnsense-revert suricata. It just hangs.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 10, 2022, 11:06:04 pm
Yes.

# opnsense-revert -r 22.7.8 suricata
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 10, 2022, 11:12:27 pm
opnsense-revert -r 22.7.8 suricata runs for a couple minutes, then fails with timed out

Then tried changing repository and all I get are dots. after 10 minutes on third row of dots.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 11, 2022, 12:16:10 am
You can download it directly from here:

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/OpenSSL/All/

Scroll down and find:

   suricata-6.0.8_1.pkg   2022-11-16 21:31   1.9M   

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/OpenSSL/All/suricata-6.0.8_1.pkg

I'm assuming you're on OpenSSL.  Easy enough to figure out, if you're on LibreSSL.

https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/22.7.8/LibreSSL/All/suricata-6.0.8_1.pkg
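
If `opnsense-revert` times out, a manual download and install might look like the sketch below. The URL layout is taken from the links above; `fetch` and `pkg add -f` are standard FreeBSD tools, but using them this way on OPNsense is an assumption, and the helper names (`run`, `pkg_url`, `install_old_suricata`) are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the mirror URL from release, crypto flavour and package file name.
pkg_url() {
    echo "https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/MINT/$1/$2/All/$3"
}

# Download suricata 6.0.8_1 to /tmp and install it over the current one.
# $1 is the crypto flavour: OpenSSL (default) or LibreSSL.
install_old_suricata() {
    flavour=${1:-OpenSSL}
    file=suricata-6.0.8_1.pkg
    run fetch -o "/tmp/$file" "$(pkg_url 22.7.8 "$flavour" "$file")"
    run pkg add -f "/tmp/$file"
}

# Example: DRY_RUN=0 install_old_suricata OpenSSL
```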
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 11, 2022, 12:44:20 am
I had the same fault with Unbound and the web page becoming unresponsive. In fact I initially thought the problem was with Unbound, and I didn't know about the Suricata thing until later. I still don't understand what was causing that.
Downgrading Suricata as above should fix. It did for me.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 11, 2022, 10:03:35 pm
I had to go to my backup hardware, install and update OPNsense, restore the configuration, then run opnsense-revert suricata, which worked. Everything seems OK now using Suricata 6.0.8_1. This site is the only one I have using IDS. All the others updated without incident. I will not do another update on this system until I know the Suricata issues are resolved. Probably a good idea to look at the forum after a new release comes out. My bad.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on December 12, 2022, 02:33:21 am
Make sure to lock Suricata at 6.0.8_1
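
From the shell, the revert-and-lock step might look like the sketch below. It assumes the standard FreeBSD `pkg lock` / `pkg unlock` commands behave on OPNsense as they do on stock FreeBSD; the helper names (`run`, `pin_suricata`, `unpin_suricata`) are hypothetical. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Revert suricata to the 22.7.8 build, then lock it so that later
# firmware updates leave it alone.
pin_suricata() {
    run opnsense-revert -r 22.7.8 suricata
    run pkg lock -y suricata
    run pkg lock -l          # list locked packages to confirm
}

# Release the lock once a fixed Suricata build is available.
unpin_suricata() {
    run pkg unlock -y suricata
}

# Example: DRY_RUN=0 pin_suricata
```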


Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on December 12, 2022, 10:17:14 am
Either 6.0.8_1 or 6.0.9_1 should work. I don't know about disaster, but in my opinion the backport efforts were not really necessary. Someone did ask for them, though, and it wasn't OPNsense. ;)


Cheers,
Franco
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 12, 2022, 10:21:46 am
I have installed the patch but blocked Suricata at 6.0.8_1. Stable
I will now update Suricata also but will do a snapshot before to be able to revert easily. Will report how it goes.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 13, 2022, 02:35:07 am
Yes, this problem caused me to change my upgrade routine: I now take a snapshot of the VHD beforehand so that I can simply roll the VHD back to the state before the upgrade.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 13, 2022, 06:29:52 pm
After 24hrs+ running: 22.7.9_3 + suricata    6.0.9_1

Stable so far!
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 14, 2022, 10:02:08 am
So how do you take snapshots. Is that a plug-in? I have been looking for a way to do bare metal backups of OPNsense.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 14, 2022, 10:04:20 am
OPNsense is running in a Proxmox VM, which is where you make the snapshots. Nothing to do with OPNsense.

Baremetal: reinstall and restore previously saved config.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 14, 2022, 10:26:06 am
Ok, thanks. Thought I was missing something.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Patrick M. Hausen on December 14, 2022, 10:47:05 am
Install with ZFS - voila! Snapshots!
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 15, 2022, 11:18:09 pm
I installed with ZFS. So the snapshots are automatic? If so, how do you reinstall using snapshot. In TrueNAS, I can rollback or clone. How would you do that with OPNsense?

I wouldn't mind attempting an update again to 22.7.9_3 if I knew I could easily rollback. And just using the config backup doesn't always work. Updating to 22.7.9_3 gave me file issues. Changing the config.xml doesn't change back the OPNsense version. I had to do a fresh install of 22.7, then update to 22.7.8 then apply the config restore. Twice.

I had two main issues with 22.7.9_3. Suricata had issues, and the GUI was very sluggish. Even reverting to Suricata 6.0.8_1 had issues. It took 5+ minutes to load the firmware status, and other pages were very slow to load. 22.7.8 is quick and works. I wish the logs were helpful, but there is nothing to see there that helps resolve the issues. It must have something to do with the configuration. On two other systems with the same hardware, the upgrade to 22.7.9_3 went fine. But those don't use IDS and have a mostly default configuration with one LAN and one WAN and only a few NAT rules added. The system that has the issue has 2 WANs, multiple gateways, 4 LANs, and complex rules.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Patrick M. Hausen on December 16, 2022, 07:28:20 am
No, they are not automatic.

https://forum.opnsense.org/index.php?topic=25540.msg122731
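
As a rough sketch of what a manual snapshot before an upgrade can look like on a ZFS install, the following uses FreeBSD boot environments via `bectl`. This is an assumption about the setup and may differ from what the linked post describes; the boot environment name `pre-upgrade` and the helper names are hypothetical. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a boot environment that preserves the current system state.
snapshot_before_upgrade() {
    name=${1:-pre-upgrade}
    run bectl create "$name"
    run bectl list           # confirm the new boot environment exists
}

# Activate the saved boot environment and reboot into it.
rollback_to() {
    name=${1:-pre-upgrade}
    run bectl activate "$name"
    run shutdown -r now
}

# Example: DRY_RUN=0 snapshot_before_upgrade pre-upgrade
```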
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: dcol on December 16, 2022, 04:26:55 pm
Thanks. That is what I was looking for. Worked like a charm.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 26, 2022, 03:27:39 pm
@franco

UPDATE:

I have been running 22.7.10_2 and the previous update with the Suricata 6.0.9_1

I have found that for the last few days there comes a point when I see memory usage going up and swap space being used (I have 12GB assigned and practically never use swap), and then suddenly some domains stop resolving.
Then even more stop resolving. A reboot fixes this.
I have switched from Unbound to DNSmasq but the same happens after a day or so.
I have reverted to OPNsense 22.7.8, locked Suricata at 6.0.8_1 and updated again to 22.7.10_2.

Running fine now again, with normal memory usage and all domains resolving.

So Suricata 6.0.9_1 DOES have issues still........

Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: comet on December 27, 2022, 12:03:44 am
Does this problem affect people who are not running Suricata, I ask because I thought I had read where someone said it does?
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: LiFE1688 on December 27, 2022, 07:33:05 am
Are you using Proxmox and running OPNsense in a VM?
If you are, please disable memory ballooning.
You can find the settings in:
OPNsense VM
Hardware -> Memory
Tick Advanced
Untick Ballooning Device

It seems that FreeBSD 13.1 and MangoDB do not like Proxmox using ballooning memory: modules in OPNsense start failing, and then the whole VM crashes.
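
The same setting can presumably be changed from the Proxmox host shell with `qm`. A minimal sketch, assuming a VM ID of 100 (hypothetical; use your own) and hypothetical helper names; by default it only prints the commands, set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Disable the balloon device for the given VM ID, then show the
# resulting VM configuration for verification.
disable_ballooning() {
    vmid=$1
    run qm set "$vmid" --balloon 0
    run qm config "$vmid"
}

# Example, on the Proxmox host: DRY_RUN=0 disable_ballooning 100
```

The VM most likely needs a full stop and start for the hardware change to take effect.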

Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 27, 2022, 09:40:43 am


Yes it's on Proxmox. Ballooning was always off, this is not it.

AND I spoke too soon: Suricata 6.0.8_1 and OPNsense 22.7.10_2 had the same issue after a bit more than a day: memory tripled and DNS no longer resolving. All services were running fine from what I could see.
So there seems to be another issue here apart from suricata. Unbound (also has a bigger update)?

I have reverted to a snapshot from Dec 2nd, where all had been fine for ages. Running 22.7.9 with the plugins from that release; will see how it goes.
Title: Re: after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: LiFE1688 on December 27, 2022, 03:13:40 pm
Hope you find out what is wrong with it, for me it was ballooning memory. OPNsense and pfSense would both hang after a few hours, I don't use unbound so I have that disabled. I am using either pihole or adguard as my DNS.

After disabling, didn't need to downgrade Suricata and/or ntopng. So I am lucky on that. I do have that annoying constant non-stop

"/usr/local/opnsense/scripts/routes/gateway_status.php: plugins_run return_gateways_status (execute task : dpinger_status())"

thing in the logs in OPNsense. Other than being annoying, it doesn't have anything bad happening yet.

Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 27, 2022, 03:17:29 pm
It started stopping dns resolving in less than an hour now....

Resorted in starting VM from scratch and restoring backup and reconfiguring the rest.....

What a nightmare this has been since I updated start of this month!
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: Colt45 on December 30, 2022, 06:37:54 am
I have seen the memory balloon issue. I haven't been able to figure it out, other than that it's Suricata. I ended up increasing memory to 16GB because, like you saw with 12GB, it was pushing into swap.
Funnily, since I increased the memory about a week ago I haven't had any issue with memory; in fact it hasn't gone higher than about 4GB. It's very strange. I wonder whether it would balloon again if I went back to 12GB.
I never had an issue with DNS, apart from when the original Suricata 6.0.9 stalled.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on December 30, 2022, 11:51:30 am
After the reinstall and restore of configs I could no longer get ZeroTier working. While it connected and all seemed well, the static routes (defined in the ZT network) to reach my LAN didn't work, with a config 100% identical to before (I uninstalled and reinstalled ZT and configured everything from scratch to try).
This tipped me over the edge. Spent one day and have now pfsense running with the same configuration as OPNsense (pfblockerng instead of opnsense way of doing).
Running fine!

Will have this running now as main fw for a week and switch back to see if OPNsense stopped having the above issues with resolving DNS.

I NEED a failsafe backup (just switch cables) I can't have the issues as above and all backups from my company failing........

Time will tell with whom I'll stay ;)

Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: xsigndll on January 02, 2023, 03:57:42 pm
Was that issue resolved with the update to 22.7.10?
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: comet on January 03, 2023, 12:38:28 am
I too would like to know if this has been resolved.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: comet on January 05, 2023, 01:08:56 am
Doesn't anyone have any idea if this has been fixed yet, or whether it even affects people who are not running Suricata?  Those of us who are temporarily away from home are afraid to upgrade our routers for fear of them losing contact with the Internet after a day or so.

OPNsense has always been so reliable and problem free, and now this!
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on January 05, 2023, 08:47:50 am
I’ve spent weeks with an unreliable system from one day to the other. No clear solutions.
Set up from scratch restoring a backup (obviously) but the backup could only be restored with all packages up to date (and reintroducing the possible issue)…..
As I’ve said I switched to pfsense for now and like what I have found.
There’s always a solution.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: gurpal2000 on January 05, 2023, 07:34:43 pm
Noticed internet access was very slow since doing the upgrade a couple of days ago.

Thankfully came across this thread. Not an opnsense expert. Running opnsense on a dedicated physical machine.

Rolled back to 22.7.9 and things seem to be back to 'normal'.

opnsense-revert -r 22.7.9 opnsense
opnsense-update -kr 22.7.9
# then reboot

Cheers,
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on January 05, 2023, 07:47:32 pm
22.7.9 was the last one I used with NO issues at all. Good times....
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: SolarCzar on January 05, 2023, 11:00:56 pm
I have been combing the forum and reddit to understand what was happening to me.  Just came across you guys' post...thanks.  I too was fine before I updated to 22.7.10.  Looking at how to go back now.  Thanks
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: SolarCzar on January 05, 2023, 11:19:38 pm
> Noticed internet access was very slow since doing the upgrade a couple of days ago.
>
> Thankfully came across this thread. Not an opnsense expert. Running opnsense on a dedicated physical machine.
>
> Rolled back to 22.7.9 and things seem to be back to 'normal'.
>
> opnsense-revert -r 22.7.9 opnsense
> opnsense-update -kr 22.7.9
> # then reboot
>
> Cheers,

So I didn't get the same results after I ran the update...

# opnsense-update -kr 22.7.9
Nothing to do.
root@OPNsense:~ # reboot

Came back up as 22.7.10_2.  What did I miss?
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 05, 2023, 11:29:04 pm
You have to use revert.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 05, 2023, 11:32:31 pm
And to answer a previous poster, no, it’s not fixed.  Lock Suricata at 6.0.8.  Everything else should work fine.

You can follow the Suricata bug report Franco linked to earlier.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 05, 2023, 11:34:46 pm
And yes, .10 broke some things for sure, but OPNsense has been a rock solid platform for me for several years up until this last change.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: SolarCzar on January 06, 2023, 12:15:19 am
Ran the revert to 22.7.9_3 and now my WAN port is going up/down every 5 min.  Crap!
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 06, 2023, 12:40:35 am
> Noticed internet access was very slow since doing the upgrade a couple of days ago.
>
> Thankfully came across this thread. Not an opnsense expert. Running opnsense on a dedicated physical machine.
>
> Rolled back to 22.7.9 and things seem to be back to 'normal'.
>
> opnsense-revert -r 22.7.9 opnsense
> opnsense-update -kr 22.7.9
> # then reboot
>
> Cheers,

Do the commands exactly as shown above: revert, update, then reboot.  The first command reverts the package, the second command updates the kernel to that specific release.
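
SolarCzar's report above, where the box came back up as 22.7.10_2, suggests checking the running version after the reboot. A minimal sketch, assuming a hypothetical helper named version_matches (not part of OPNsense) that treats a hotfix suffix such as _3 as belonging to the target release:

```shell
#!/bin/sh
# Hypothetical helper, not an OPNsense tool.
# Usage: version_matches TARGET RUNNING
# Succeeds if RUNNING is TARGET itself or a hotfix of it
# (e.g. 22.7.9_3 counts as 22.7.9, but 22.7.10_2 does not).
version_matches() {
    target="$1"
    running="$2"
    case "$running" in
        "$target"|"$target"_*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with the versions mentioned in this thread.
if version_matches "22.7.9" "22.7.10_2"; then
    echo "revert took effect"
else
    echo "still on a different release, revert did not take effect"
fi
```

On the firewall, the second argument could be taken from `opnsense-version -v`, assuming that flag prints only the version string on your install.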
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 06, 2023, 12:44:02 am
And restore your last good config on .9.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: comet on January 06, 2023, 05:07:35 am
> And to answer a previous poster, no, it’s not fixed.  Lock Suricata at 6.0.8.  Everything else should work fine.

Thanks for the response.  Still running 22.7.8 here so don't have to revert anything, and I'm not using Suricata but I know it is installed on the router because I see it update when I update OPNsense.  So what I'm trying to clarify is, if I lock my current version of Suricata can I then upgrade to the current OPNsense version without issues, or am I still likely to have problems?  In other words is it definitely just Suricata that is causing all the weirdness and the loss of Internet connectivity, or are there other factors at play also?

And assuming you can do normal upgrades once you have locked Suricata, then my question is, how do you lock Suricata at the current version?  EDIT:  After more searching I found the post on Reddit (https://old.reddit.com/r/opnsense/comments/zbt3il/after_2279_update_the_gateway_suddenly_dies_after/iyw43n0/) that gave this procedure:

Quote
System - Firmware - Packages

Scroll down to Suricata. On the far right there's a lock icon. Clicking it toggles locked/unlocked.

Is that the procedure you used to lock it?

As I said previously, OPNsense has been such a joy to use up until this, but I cannot take the risk of doing an update and having the router lose internet connectivity (whether immediately or a day later).  I would really hate having to go to pfsense or some other router software after all this time, so I hope a solution that permanently fixes this problem appears soon!
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: chknpikr on January 06, 2023, 02:58:22 pm
That is the procedure.

I’m running fine on .10 with Suricata locked at .8, but I would hold at the version you’re on now.  There’s no need to rush the upgrade.  I suspect there are other underlying gremlins complicating various configs, judging from forum posts here and elsewhere.

In tech, “upgrading” a perfectly stable setup, usually turns into, “ruining your entire weekend”.  We’re all masochists.
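
The GUI lock described above should correspond to FreeBSD's pkg lock mechanism, so it can presumably be checked from a shell as well. A minimal sketch, assuming `pkg lock -l` prints one name-version entry per line (for example suricata-6.0.8); is_locked is a hypothetical helper, not an OPNsense tool:

```shell
#!/bin/sh
# Hypothetical helper: reads `pkg lock -l` style output on stdin and
# succeeds if a package with the given name appears in it.
# Assumes one "name-version" entry per line, e.g. "suricata-6.0.8".
is_locked() {
    name="$1"
    while IFS= read -r line; do
        case "$line" in
            "$name"-[0-9]*) return 0 ;;
        esac
    done
    return 1
}

# On the firewall (not run here):
#   pkg lock -y suricata     # lock the currently installed version
#   pkg lock -l | is_locked suricata && echo "suricata is locked"
#   pkg unlock -y suricata   # undo the lock once a fixed version ships
```

Keeping the check separate from the lock command means it can be run before every firmware update without changing anything on the system.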
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: SolarCzar on January 07, 2023, 01:55:35 pm
Amen to that last post!  My last two weekends have been toast.  I tried reverting to 22.7.9 and it made my WAN flapping worse.  I'm thinking about going to 22.7.8, but I'm also outside my wheelhouse of expertise and fearful that I will make the problem worse.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on January 09, 2023, 01:56:22 pm
This thread is a bit of a mess, I'm sorry to say.

Despite claims here, the only change in 22.7.9 was Suricata 6.0.9, which we reverted in 22.7.9_2 / 6.0.9_1.

The issue being triggered is INHERENT to the system configuration, the hardware (or VM) being used and the FreeBSD kernel. So far it looks like 6.0.9 just triggers it more often than not and I think that poking at it will just make it worse if you insist on using IPS with the hardware (or VM) at hand.

Please DO NOT spread oversimplified statements about a particular OPNsense version being worse than another one, because it is always a SINGLE component in the release notes causing this behaviour on YOUR system, NOT everyone's system.


Cheers,
Franco
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on January 09, 2023, 02:42:51 pm
Also, fact: up until one specific update all worked for many updates and major versions before. IPS etc.
And then ONE update breaks all in a way that the basic usage is completely compromised.
That's a fact and no finger pointing and blaming......
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on January 09, 2023, 02:49:09 pm
I think that argument is a bit spurious; the all-or-nothing seems to indicate that this is not a specific issue with the update. Reverts, even partial ones, have to work in order for your theory to be confirmed. So as soon as you can pinpoint it I'm happy to look at it as always.  :)

The free support advice is to disable IPS or replace hardware, whichever is more convenient. Complaining to others likely isn't as convenient as it looks at first glance.

If there is another issue with Suricata 6.0.9 it's good to look into it, but rest assured that we cannot do QA for other projects on a large scale and hold back stable updates forever, especially with security updates intermingled by the respective authors.


Cheers,
Franco
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: manilx on January 09, 2023, 02:58:46 pm
Again: I'm looking (was looking) at it from an end user perspective. If my car breaks down after an intervention I can't know if it's a Bosch or Siemens part or if BMW screwed up. I just want a car working as before. Might be a stupid example, I know.
And if the update broke my system I blame the update not the parts it's made of.
You might be technically right but that's not what I meant (as OP). I had "really" big issues and spent many days trying to fix it, without success.
I found a "workaround" and will see what the next big version update brings.

No hard feelings anyway.
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: franco on January 09, 2023, 03:10:55 pm
> And if the update broke my system I blame the update not the parts it's made of.

I know your intention but the bottom line is the culprit was fixed in 22.7.9_2 and in 22.7.10 as well. Whatever you see on those versions after a clean reboot with a passed health audit is what was the case in the 22.7 install and FreeBSD 13.1 by extension.

That's the reason why I'm asking which component in which version the complaint is about and whether you have switched IPS mode off to confirm the problem goes away.


Cheers,
Franco
Title: Re: [update] after 22.7.9 update the gateway suddenly dies after 1 day or so
Post by: gurpal2000 on January 16, 2023, 03:03:05 pm
@franco - appreciate your comments.

Perhaps you could post a final statement (fix/advice) and maybe lock the thread before it causes further confusion (for want of a better word).

thanks