[update] after 22.7.9 update the gateway suddenly dies after 1 day or so

Started by manilx, December 03, 2022, 11:19:45 PM

Does this problem affect people who are not running Suricata? I ask because I thought I had read that someone said it does.
I'm a home user of OPNsense, not a networking expert.  I'd much appreciate it if you'd keep that in mind if replying to something I posted.  Many thanks!

Are you using Proxmox and running OPNsense in a VM?
If you are, please disable memory ballooning.
You can find the settings in:
OPNsense VM
Hardware -> Memory
Tick Advanced
Untick Ballooning Device

It seems that FreeBSD 13.1 and MongoDB do not like Proxmox memory ballooning: modules in OPNsense start failing, and then the whole VM crashes.
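
For anyone who prefers the Proxmox host shell to the GUI, the same change can be sketched as below. This is only a sketch: `disable_ballooning` is a made-up helper name and the VM ID 100 in the usage comment is a placeholder. It relies on `qm set` accepting `--balloon 0`, which turns the balloon driver off for that VM.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): turn off the balloon
# device for one VM from the Proxmox host shell.
disable_ballooning() {
    vmid="$1"
    if [ -z "$vmid" ]; then
        echo "usage: disable_ballooning <vmid>" >&2
        return 1
    fi
    # A balloon value of 0 disables the balloon driver for this VM.
    qm set "$vmid" --balloon 0
}

# Example (100 is a placeholder VM ID):
#   disable_ballooning 100
```

The VM probably needs a full stop and start from Proxmox afterwards, not just a reboot from inside OPNsense, before the change applies.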

Quote from: manilx on December 26, 2022, 03:27:39 PM
@franco

UPDATE:

I have been running 22.7.10_2 and the previous update with Suricata 6.0.9_1.

For the last few days I have reached a point where memory usage goes up, swap space gets used (I have 12GB assigned and practically never use swap), and then suddenly some domains stop resolving.
Then even more stop resolving. A reboot fixes this.
I have switched from Unbound to Dnsmasq, but the same happens after a day or so.
I have reverted to OPNsense 22.7.8, locked Suricata at 6.0.8_1, and updated again to 22.7.10_2.

Running fine now again, with normal memory usage and all domains resolving.

So Suricata 6.0.9_1 DOES have issues still........

Quote from: LiFE1688 on December 27, 2022, 07:33:05 AM
Are you using Proxmox and running OPNsense in a VM?
If you are, please disable memory ballooning.
You can find the settings in:
OPNsense VM
Hardware -> Memory
Tick Advanced
Untick Ballooning Device

It seems that FreeBSD 13.1 and MongoDB do not like Proxmox memory ballooning: modules in OPNsense start failing, and then the whole VM crashes.


Yes it's on Proxmox. Ballooning was always off, this is not it.

AND I spoke too soon: Suricata 6.0.8_1 and OPNsense 22.7.10_2 had the same issue after a bit more than a day: memory tripled and DNS no longer resolved. All services were running fine from what I could see.
So there seems to be another issue here apart from Suricata. Unbound (which also had a bigger update)?

I have reverted to a snapshot from Dec 2nd, when all had been fine for ages. Running 22.7.9 with the plugins from that release; I will see how it goes.
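
To catch the moment when swap starts filling up and names stop resolving, something like the following could log both over time. It is a rough sketch under assumptions: it expects the FreeBSD tools `swapinfo` and `host` to be available on the firewall, the function names are made up, and the default domain list is only an example.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: log swap usage and whether a few test
# domains still resolve. The domain list is illustrative only.
DOMAINS="${DOMAINS:-opnsense.org freebsd.org}"

# Print swap used in KiB by summing the "Used" column of `swapinfo -k`,
# skipping the header line and the "Total" line (if present).
swap_used_kib() {
    swapinfo -k | awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# Return 0 if the given name resolves, non-zero otherwise.
resolves() {
    host "$1" >/dev/null 2>&1
}

# Print one timestamped line for swap and one line per test domain.
check_once() {
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    echo "$ts swap_used_kib=$(swap_used_kib)"
    for d in $DOMAINS; do
        if resolves "$d"; then
            echo "$ts dns_ok $d"
        else
            echo "$ts dns_fail $d"
        fi
    done
}

# Example: call check_once from cron every few minutes and append the
# output to a log file of your choosing.
```

Comparing the timestamps of the first `dns_fail` lines with the swap figures would show whether DNS really breaks only once swap is in use.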

Hope you find out what is wrong with it; for me it was memory ballooning. OPNsense and pfSense would both hang after a few hours. I don't use Unbound, so I have that disabled. I am using either Pi-hole or AdGuard as my DNS.

After disabling ballooning, I didn't need to downgrade Suricata and/or ntopng, so I am lucky on that. I do have that annoying, constant, non-stop

"/usr/local/opnsense/scripts/routes/gateway_status.php: plugins_run return_gateways_status (execute task : dpinger_status())"

thing in the logs in OPNsense. Other than being annoying, it hasn't caused anything bad so far.

Quote from: manilx on December 27, 2022, 09:40:43 AM
Yes it's on Proxmox. Ballooning was always off, this is not it.

AND I spoke too soon: Suricata 6.0.8_1 and OPNsense 22.7.10_2 had the same issue after a bit more than a day: memory tripled and DNS no longer resolved. All services were running fine from what I could see.
So there seems to be another issue here apart from Suricata. Unbound (which also had a bigger update)?

I have reverted to a snapshot from Dec 2nd, when all had been fine for ages. Running 22.7.9 with the plugins from that release; I will see how it goes.

DNS resolving now stops in less than an hour....

I resorted to setting up the VM from scratch, restoring the backup and reconfiguring the rest.....

What a nightmare this has been since I updated start of this month!

I have seen the memory balloon issue. I haven't been able to figure it out, other than that it's Suricata. I ended up increasing memory to 16GB because, as you saw with 12GB, it was pushing onto swap.
Funnily, since I increased memory about a week ago I haven't had any issue with memory; in fact it hasn't gone higher than about 4GB. It's very strange. I wonder whether it would balloon again if I went back to 12GB.
I never had an issue with DNS, apart from when I used the original Suricata 6.0.9 and it would stall.

After the reinstall and restore of configs I could no longer get ZeroTier working. While it connected and all seemed well, the static routes (defined in the ZT network) to reach my LAN didn't work. The config was 100% the same as before (I uninstalled and reinstalled ZT and configured everything from scratch to try).
This tipped me over the edge. I spent one day and now have pfSense running with the same configuration as OPNsense (pfBlockerNG instead of the OPNsense way of doing it).
Running fine!

I will run this as my main firewall for a week and then switch back to see if OPNsense has stopped having the above issues with resolving DNS.

I NEED a failsafe backup (just switch cables). I can't have the issues above, with all the backups from my company failing........

Time will tell with whom I'll stay ;)



I too would like to know if this has been resolved.

Doesn't anyone have any idea if this has been fixed yet, or whether it even affects people who are not running Suricata?  Those of us who are temporarily away from home are afraid to upgrade our routers for fear of them losing contact with the Internet after a day or so.

OPNsense has always been so reliable and problem free, and now this!

I've spent weeks with a system that became unreliable from one day to the next. No clear solutions.
I set it up from scratch and restored a backup (obviously), but the backup could only be restored with all packages up to date (reintroducing the possible issue).....
As I've said, I switched to pfSense for now and like what I have found.
There's always a solution.

Noticed internet access has been very slow since doing the upgrade a couple of days ago.

Thankfully came across this thread. Not an OPNsense expert. Running OPNsense on a dedicated physical machine.

Rolled back to 22.7.9 and things seem to be back to 'normal'.

opnsense-revert -r 22.7.9 opnsense
opnsense-update -kr 22.7.9
# then reboot

Cheers,
OPNsense + TP-Link W9970

22.7.9 was the last one I used with NO issues at all. Good times....

I have been combing the forum and Reddit to understand what was happening to me. Just came across your posts... thanks. I too was fine before I updated to 22.7.10. Looking at how to go back now. Thanks

Quote from: gurpal2000 on January 05, 2023, 07:34:43 PM
Noticed internet access has been very slow since doing the upgrade a couple of days ago.

Thankfully came across this thread. Not an OPNsense expert. Running OPNsense on a dedicated physical machine.

Rolled back to 22.7.9 and things seem to be back to 'normal'.

opnsense-revert -r 22.7.9 opnsense
opnsense-update -kr 22.7.9
# then reboot

Cheers,

So I didn't get the same results after I ran the update...

# opnsense-update -kr 22.7.9
Nothing to do.
root@OPNsense:~ # reboot

Came back up as 22.7.10_2.  What did I miss?
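
One thing that may be worth checking here (an assumption, not a confirmed diagnosis): as far as I understand, `opnsense-update -kr` only deals with the kernel, while the version shown after boot comes from the `opnsense` core package, which is what the `opnsense-revert -r 22.7.9 opnsense` line in the quoted post handles. A small sketch to print both after the reboot; `show_versions` is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: show the core package version and the installed
# kernel version side by side, since they are rolled back separately.
show_versions() {
    # opnsense-version reports the installed OPNsense core version.
    echo "core:   $(opnsense-version)"
    # freebsd-version -k reports the version of the installed kernel.
    echo "kernel: $(freebsd-version -k)"
}

# Example:
#   show_versions
```

If the core line still reads 22.7.10_2, the core package was probably never reverted, regardless of what the kernel step reported.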