A few moments ago my firewall locked up and we lost internet and local network access.
I was able to reboot through the console and it looks like we are back up.
Looking at the logs I see the following warnings:
Date  Severity  Process  Line
2023-08-26T12:02:44-07:00 Error configd.py Timeout (120) executing : zenarmor database-disk-size /usr/local/datastore/mongodb
2023-08-26T11:57:18-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status
2023-08-26T11:50:25-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status
2023-08-26T11:02:11-07:00 Error configd.py [698dfda4-490d-4e98-8266-618ad74e3b65] Script action stderr returned "b'Parse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error near line 1: database is locked (5)\nParse error nea'"
... plus a number of subsequent timeout errors of the same type.
Is any action recommended other than the occasional reboot? I'm using native netmap drivers.
Type opnsense
Version 23.7.1_3
Architecture amd64
Commit 239f8d1f8
Mirror https://pkg.opnsense.org/FreeBSD:13:amd64/23.7
Repositories OPNsense, SunnyValley
Updated on Thu Aug 17 09:32:10 MST 2023
ZenArmor:
Engine
Status: Running
Version: 1.14.4 - Aug 23, 2023 10:12 AM
Disk space
Usage: 9.0 GB / 421.9 GB
Mount: /
Usage: 599.3 MB / 413.4 GB
Mount: /var/log
etc.
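For anyone who wants to tally errors like the ones quoted above instead of reading them by eye, here is a minimal Python sketch. It assumes log lines shaped exactly like the configd.py entries in this post (timestamp, severity, process, message). The names `summarize` and `SAMPLE` are made up for illustration and are not part of OPNsense or Zenarmor.

```python
import re
from collections import Counter

# Matches lines shaped like the configd.py entries quoted above:
# <ISO timestamp> <severity> <process> <message>
LINE_RE = re.compile(r"^(\S+)\s+(\w+)\s+(\S+)\s+(.*)$")
TIMEOUT_RE = re.compile(r"Timeout \((\d+)\) executing : (.+)$")


def summarize(lines):
    """Count timeouts per command and 'database is locked' occurrences."""
    timeouts = Counter()
    locked = 0
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a log row
        message = m.group(4)
        t = TIMEOUT_RE.search(message)
        if t:
            timeouts[t.group(2).strip()] += 1
        locked += message.count("database is locked")
    return timeouts, locked


# Sample rows copied from the log excerpt in this post.
SAMPLE = [
    "2023-08-26T12:02:44-07:00 Error configd.py Timeout (120) executing : zenarmor database-disk-size /usr/local/datastore/mongodb",
    "2023-08-26T11:57:18-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status",
    "2023-08-26T11:50:25-07:00 Error configd.py Timeout (120) executing : zenarmor service mongod status",
]

if __name__ == "__main__":
    counts, locked = summarize(SAMPLE)
    for command, n in counts.most_common():
        print(n, command)
    print("database is locked:", locked)
```

Feeding it the full exported log rather than the three sample rows would show whether the timeouts cluster on the database status checks or are spread across other commands.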
Brief update:
Since rebooting my CPU has been pegged at 100% most of the time. Previously, it had been varying from 0-60% or so. Stopping MongoDB gives 25-40%. I'll try leaving it off. No database, of course, but better than a locked-up firewall.
Along the way, I switched from native to emulated netmap. No difference, so for now I'm staying with emulated.
FYI I've read that Elasticsearch is preferred, but I was not presented with that option when I reinstalled ZA recently.
There is a third option, namely SQL database. It is by far the least problematic.
Both MongoDB and SQL should be limited to two days of data retention. Elasticsearch can retain more, but that's only for higher-end machines.
Thank you! I hadn't appreciated that SQL was less problematic. I'll reinstall ZA with SQL.
And yes, I'll keep retention to two days, as I did with MongoDB.
After several hours, I'm seeing a similar issue with SQL. CPU pegged at 100%. Resetting the database returns things to normal.
I think my hardware should be powerful enough to run ZenArmor. Smartctl passes.
Any other ideas?
Protectli VP2410
Intel Celeron J4125
16 GB DRAM
480 GB M.2 SSD
coreboot BIOS
Type opnsense
Version 23.7.2
Architecture amd64
Commit 81a9dcc9c
Mirror https://pkg.opnsense.org/FreeBSD:13:amd64/23.7
Repositories OPNsense, SunnyValley
Updated on Sun Aug 27 09:01:37 MST 2023
I run Zenarmor on an APU2, which is even slower (lower end than yours).
Thank you, that is helpful. :)
This morning my network was down again. The console showed screens full of netmap errors (emulated adapter ... destroyed... Native netmap emulator ... created... emulated adapter ... created).
I'm trying native netmap now. I should know in a few hours if that is my issue.
Only took an hour this time. CPU pegged at 100% again.
Trying passive mode now.
OPNsense is running well so far with ZA in passive mode. I added some blocklists to Unbound DNS, and I think that I'm blocking most of the traffic that ZA had been blocking, as far as I can tell.
I can continue operating in this mode for some time, but I'm curious: why does ZA routing cause my system to lock up? I've run a few hardware tests, including smartctl (long and short tests), Memtest 86+, and s-tui stress. I'm on my second SSD.
I've disabled hardware CRC, TSO, LRO, and VLAN filtering.
I do run LAGG (LACP) on two of the four ports. Could that be the issue?
Hi,
What is the top process when CPU is at 100%?
Can you share the logs from when Zenarmor is in routed mode?
https://www.zenarmor.com/docs/support/reporting-bug#as-of-v114
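One way to capture the top process while the CPU is pegged is a short Python sketch like the one below. It assumes `ps -axo pid,pcpu,comm` is available on the firewall's shell and that Python 3 is installed (OPNsense's configd is Python-based). The helper names `parse_ps` and `top_cpu` are hypothetical.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -axo pid,pcpu,comm` output into (pid, cpu, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            rows.append((int(parts[0]), float(parts[1]), parts[2].strip()))
        except ValueError:
            continue  # ignore lines that do not look like process rows
    return rows


def top_cpu(rows, n=5):
    """Return the n rows with the highest CPU percentage."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


if __name__ == "__main__":
    out = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, cpu, comm in top_cpu(parse_ps(out)):
        print(f"{cpu:5.1f}%  {pid:>6}  {comm}")
```

Running it a few times while the load is high, and pasting the output into the thread, would answer the question above without needing an interactive session.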
@sy,
I've temporarily removed ZenArmor after a crash. I did submit a crash report. I recall a number of netmap messages like those in my third post from Aug 28.
I'll be away for the next two weeks, but I'll revisit after I return. Thanks for your interest and support.
I ran into the same problem and used the builtin feedback to send logs and configuration. After disabling ZenArmor the firewall runs smoothly again. I hope this gets sorted out soon.
According to https://dash.zenarmor.com/firewalls/ , Zenarmor was using 85% of 9.47 GiB RAM. So, if you're short of RAM, that might be your problem.
Oddly enough, the APU2, with only 4 GB RAM, does not have this problem.
Nope, got 16 GB of RAM. I deleted Zenarmor and reinstalled it using Elasticsearch as a backend database. Let's see how that goes ...
Protectli VP2410
Intel Celeron J4125
16 GB DRAM
coreboot BIOS
@chaosphere64,
All along I thought it was just me. Good to know at least someone else sees the same problem. But I'm sorry, too.
Are you using LAGG?
One of my suspicions has been the BIOS itself. I ran into a memory problem early on, now resolved. I'm wondering if there could be a disk management issue too.
I've considered flashing AMI, but there are risks.
I'm out of town for a couple of weeks; hoping that you or someone else manages to resolve the problem.
To be honest, I have run the same hardware with OPNsense for almost a year now and never had any problems before the latest updates. I have opened a ticket; maybe Sunnyvalley can find something.
Other than that, after a complete reinstall and reconfiguration (no restore) plus a switch to Elasticsearch, I have not had a problem yet. The vendor itself recommends the latter if you run into problems with MongoDB:
https://www.zenarmor.com/docs/introduction/hardware-requirements (green "Tip" box).
BR
Thanks. I used to run Elasticsearch, but with reinstallation I was not presented with the option to use local Elasticsearch. I'm running a Protectli VP2410 with 16 GB DRAM and 480 GB SSD. Should be plenty of storage, memory, and even processing speed.
Let us know if you learn more. Thanks.
If you know what you're doing, you may doctor the results of the hardware requirements test.
I just reinstalled this morning. I'm using SQL again because I do not know how to enable local Elasticsearch on my hardware (and I don't want the overhead of setting up and maintaining a remote database).
I limited data retention to 1-day, even though I have plenty of storage.
So far, ZA is running well - CPU use is low, no obvious speed issues, memory use is < 35%. Will update after it's been running for a while.
Well, it only took a few hours for OPNsense to lock up.
I had errors displayed on the console that were similar to before (netmap emulated adapter destroyed ... created etc.)
Yes, I was running the latest update.
I submitted a full report to Sunnyvalley (checked all options), as requested previously. Removing ZenArmor now.
I'm disappointed. I purchased this hardware specifically to run ZenArmor on OPNsense.
In my experience, if I give the VM 8 GB or 10 GB RAM, it misbehaves. Cutting its RAM to 4200 MB solved the problem; it learned to behave. I advise you to do the same: maybe it does not like having 16 GB RAM, so give it only 4 GB.
In theory, the above should be bad advice. In practice, it works.
I'm running OPNsense on bare metal (no VM). Did you mean to say that I could constrain available memory? I'm not quite sure how to do that. (Anything at the BIOS-level might be risky given other experience with this hardware.)
Thanks!
Why don't you install ProxMox on it? Just save your OPNsense config in the cloud beforehand.
Quote from: doug_phoenix on September 20, 2023, 12:38:13 AM
I'm running OPNsense on bare metal (no VM). Did you mean to say that I could constrain available memory? I'm not quite sure how to do that. (Anything at the BIOS-level might be risky given other experience with this hardware.)
Thanks!
As for the BIOS, other than ensuring you have the latest from Protectli, there's not much to be done.
What version of OPNsense/Zenarmor are you running currently? This sounds like a netmap issue.
I'm running OPNsense 23.7.4-amd64. I've removed ZenArmor, but I downloaded and installed the plugins just yesterday, and ZA reported that the engine was the "latest."
Yes, it seems to me like a netmap issue too. I've had issues with native and emulated drivers.
Thank you.
Reply to @almodovaris
I suppose Proxmox is an option to leverage spare resources on the Protectli box. I might try that some day. But as a way to solve the ZA issues (which seem netmap-related), it looks like another rabbit hole. I've been down a few already...
Having never run VMs, my understanding is that Proxmox would consume a little overhead, and I would have concerns about the demands of both OPNsense/ZenArmor on one VM and another VM running Elasticsearch. My box runs a Celeron J4125.
Thanks.
Update:
SunnyValley has provided a patch that I have tested on my system. I am no longer seeing database or Netmap-related crashes. :)
I had another issue. My temporary memory filled up within a few hours. Along with this, CPU use increased (60-85% with spikes to 100%) and processor temperature rose from 45 to 58 °C. No crashes, but the firewall became sluggish. This was resolved by changing web controls from "moderate" to "permissive."
Feedback from tech support is that this issue is due to hardware insufficiency. My system uses a Celeron J4125.
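For anyone who wants to watch for this kind of fill-up before the firewall gets sluggish, a minimal Python sketch follows. It assumes "temporary memory" refers to a filesystem such as `/tmp`, which is a guess. `/` and `/var/log` are the mounts listed earlier in this thread. The function names and the 90% threshold are made up for illustration.

```python
import os
import shutil


def usage_percent(path):
    """Return used space on the filesystem holding `path` as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(paths, limit=90.0):
    """Return (path, percent) pairs whose usage exceeds `limit` percent."""
    flagged = []
    for path in paths:
        if not os.path.exists(path):
            continue  # skip mounts that do not exist on this system
        percent = usage_percent(path)
        if percent > limit:
            flagged.append((path, percent))
    return flagged


if __name__ == "__main__":
    # "/tmp" is an assumed location for the temporary storage mentioned above.
    for path, percent in over_threshold(["/", "/var/log", "/tmp"], limit=90.0):
        print(f"{path} is {percent:.1f}% full")
```

Run periodically (for example from cron), it would show how quickly the space fills after switching web controls back to "moderate."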