MongoDB database locked

Started by doug_phoenix, August 26, 2023, 09:57:32 PM

A few moments ago my firewall locked up and we lost internet and local network access.

I was able to reboot through the console and it looks like we are back up.

Looking at the logs I see the following warnings:
Date   Severity   Process   Line
2023-08-26T12:02:44-07:00 Error configd.py Timeout (120) executing : zenarmor database-disk-size /usr/local/datastore/mongodb
2023-08-26T11:57:18-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status
2023-08-26T11:50:25-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status
2023-08-26T11:02:11-07:00 Error configd.py [698dfda4-490d-4e98-8266-618ad74e3b65] Script action stderr returned "b'Parse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error nea'"


... plus a number of subsequent timeout errors of the same type.
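
For anyone trying to get an overview of a log like this, here is a minimal Python sketch that tallies the configd timeouts per command and counts the "database is locked" messages. It assumes the log has been saved as plain text, one entry per line, in the layout shown above; the `summarize` helper and the `SAMPLE` list are illustrative names, not part of OPNsense or Zenarmor, and the last sample entry is abbreviated from the excerpt. As a side note, the "database is locked (5)" wording matches SQLite's SQLITE_BUSY result code, so the locked file may be a SQLite database rather than MongoDB itself, but that is only a guess from the message text.

```python
import re
from collections import Counter

# Matches configd timeout entries in the layout shown in the excerpt:
#   <timestamp> Error configd.py Timeout (<seconds>) executing : <command>
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\S+)\s+Error\s+configd\.py\s+"
    r"Timeout \((?P<secs>\d+)\) executing : (?P<cmd>.+)$"
)


def summarize(lines):
    """Return (timeouts per command, count of 'database is locked' messages)."""
    timeouts = Counter()
    locked = 0
    for line in lines:
        line = line.strip()
        match = TIMEOUT_RE.match(line)
        if match:
            timeouts[match.group("cmd")] += 1
        # One log entry can contain the message several times.
        locked += line.count("database is locked")
    return timeouts, locked


# Entries copied from the excerpt above; the last one is abbreviated.
SAMPLE = [
    "2023-08-26T12:02:44-07:00 Error configd.py Timeout (120) executing : "
    "zenarmor database-disk-size /usr/local/datastore/mongodb",
    "2023-08-26T11:57:18-07:00 Error configd.py Timeout (120) executing : "
    "zenarmor service mongod status",
    "2023-08-26T11:50:25-07:00 Error configd.py Timeout (120) executing : "
    "zenarmor service mongod status",
    '2023-08-26T11:02:11-07:00 Error configd.py [698dfda4-490d-4e98-8266-618ad74e3b65] '
    'Script action stderr returned "b\'Parse error near line 1: database is locked (5)\\n'
    'Parse error near line 1: database is locked (5)\'"',
]

if __name__ == "__main__":
    timeouts, locked = summarize(SAMPLE)
    for cmd, count in timeouts.most_common():
        print(f"{count:3d}  {cmd}")
    print(f"'database is locked' occurrences: {locked}")
```

Seeing which command times out most often, and whether the lock messages come before the timeouts, can help narrow down whether the database or the service status check stalls first.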

Is any action recommended other than the occasional reboot? I'm using native netmap drivers.


Type   opnsense   
Version   23.7.1_3   
Architecture   amd64   
Commit   239f8d1f8   
Mirror   https://pkg.opnsense.org/FreeBSD:13:amd64/23.7   
Repositories   OPNsense, SunnyValley   
Updated on   Thu Aug 17 09:32:10 MST 2023

ZenArmor:
Engine
Status:   Running
Version:   1.14.4 - Aug 23, 2023 10:12 AM

Disk space
Usage:   9.0 GB / 421.9 GB
Mount:   /

Usage:   599.3 MB / 413.4 GB
Mount:   /var/log

etc.


August 27, 2023, 12:22:12 AM #1 Last Edit: August 27, 2023, 12:24:51 AM by doug_phoenix
Brief update:

Since rebooting, my CPU has been pegged at 100% most of the time. Previously, it had been varying from 0-60% or so. With MongoDB stopped, it drops to 25-40%, so I'll try leaving it off. No database, of course, but better than a locked-up firewall.

Along the way, I switched from native to emulated netmap. No difference, so for now I'm staying with emulated.

FYI I've read that Elasticsearch is preferred, but I was not presented with that option when I reinstalled ZA recently.

August 27, 2023, 10:08:23 AM #2 Last Edit: August 27, 2023, 10:10:04 AM by almodovaris
There is a third option, namely an SQL database. It is by far the least problematic.

Both MongoDB and SQL should be limited to two days of retention. Elasticsearch can keep more, but only on higher-end machines.
OPNsense HW:

Minisforum Venus series UN100C, 16 GB RAM, 512 GB SSD
T-bao N9N Pro, 16 GB RAM, 512 GB SSD

Thank you! I hadn't appreciated that SQL was less problematic. I'll reinstall ZA with SQL.

And yes, I'll keep retention to two days, as I did with MongoDB.

After several hours, I'm seeing a similar issue with SQL. CPU pegged at 100%. Resetting the database returns things to normal.

I think my hardware should be powerful enough to run ZenArmor. Smartctl passes.
Any other ideas?

Protectli VP2410
Intel Celeron J4125
16 GB DRAM
480 GB M.2 SSD
coreboot BIOS

Type   opnsense   
Version   23.7.2   
Architecture   amd64   
Commit   81a9dcc9c   
Mirror   https://pkg.opnsense.org/FreeBSD:13:amd64/23.7   
Repositories   OPNsense, SunnyValley   
Updated on   Sun Aug 27 09:01:37 MST 2023

I run Zenarmor on an APU2, which is even slower (lower end than yours).

Thank you, that is helpful.  :)

This morning my network was down again. The console showed screens full of netmap errors (emulated adapter ... destroyed... Native netmap emulator ... created... emulated adapter ... created).

I'm trying native netmap now. I should know in a few hours if that is my issue.

Only took an hour this time. CPU pegged at 100% again.

Trying passive mode now.

OPNsense is running well so far with ZA in passive mode. I added some blocklists to Unbound DNS, and I think that I'm blocking most of the traffic that ZA had been blocking, as far as I can tell.

I can continue operating in this mode for some time, but I'm curious: why does running ZA in routed mode cause my system to lock up? I've run a few hardware tests, including smartctl (long and short tests), Memtest86+, and an s-tui stress test. I'm on my second SSD.

I've disabled hardware checksum offload, TSO, LRO, and VLAN hardware filtering.

I do run LAGG (LACP) on two of the four ports. Could that be the issue?

Hi,

What is the top process when the CPU is at 100%?

Can you share the logs from when Zenarmor is in routed mode?
https://www.zenarmor.com/docs/support/reporting-bug#as-of-v114
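
One way to capture the top process for a report is a small script that can be run from the console while the CPU is pegged. The following is a minimal Python sketch, assuming `ps -axo pid,pcpu,comm` is available (it is a standard invocation on FreeBSD-based systems); `parse_ps`, `top_processes`, and the sample process names are hypothetical and only for illustration.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse `ps -axo pid,pcpu,comm` output and return the busiest processes.

    Returns a list of (cpu_percent, pid, command) tuples, highest CPU first.
    """
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            rows.append((float(pcpu), int(pid), comm.strip()))
        except ValueError:
            continue  # ignore lines that do not parse cleanly
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


def top_processes(limit=5):
    """Run ps once and return the `limit` processes using the most CPU."""
    result = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, limit)


# Invented sample output, used only to show the parsing step.
SAMPLE = "\n".join([
    "  PID %CPU COMMAND",
    "    1  0.0 init",
    "  101 97.5 proc_a",
    "  202 12.0 proc_b",
])

if __name__ == "__main__":
    for cpu, pid, comm in parse_ps(SAMPLE, limit=2):
        print(f"{cpu:5.1f}%  pid={pid}  {comm}")
```

On the firewall itself, calling `top_processes()` instead of parsing `SAMPLE` takes a live snapshot; running it a few times in a row shows whether the same process stays on top.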


August 31, 2023, 10:15:29 PM #10 Last Edit: September 01, 2023, 12:27:34 AM by doug_phoenix
@sy,

I've temporarily removed ZenArmor after a crash. I did submit a crash report. I recall a number of netmap messages like those in my third post, from Aug 28.

I'll be away for the next two weeks, but I'll revisit after I return. Thanks for your interest and support.

I ran into the same problem and used the built-in feedback to send logs and configuration. After disabling ZenArmor, the firewall runs smoothly again. I hope this gets sorted out soon.

September 01, 2023, 05:16:23 PM #12 Last Edit: September 01, 2023, 05:18:07 PM by almodovaris
According to https://dash.zenarmor.com/firewalls/, Zenarmor was using 85% of 9.47 GiB RAM. So, if you're short of RAM, that might be your problem.

Oddly enough, the APU2, with only 4 GB of RAM, does not have this problem.

Nope, got 16 GB of RAM. I deleted Zenarmor and reinstalled it using Elasticsearch as a backend database. Let's see how that goes ...

Protectli VP2410
Intel Celeron J4125
16 GB DRAM
coreboot BIOS

@chaosphere64,

All along I thought it was just me. Good to know at least someone else sees the same problem. But I'm sorry, too.

Are you using LAGG?

One of my suspicions has been the BIOS itself. I ran into a memory problem early on, now resolved. I'm wondering if there could be a disk management issue too.
I've considered flashing AMI, but there are risks.

I'm out of town for a couple of weeks; hoping that you or someone else manages to resolve the problem.