Messages - anicoletti

#1
Sadly, it is not happening on my local unit, and since I'm losing all connectivity on every WAN and LAN, I cannot access the affected unit to pull any additional information. I was able to have a user look at the console, and there were no alerts of interface loss or connectivity issues.
#2
We have two clients running 24.7.6 at this time, and both are having random issues where the firewall will just stop responding to either internal or external traffic. I'm not seeing any interface disconnects in the logs, and a restart resolves the issue. These are also different hardware (one is a Protectli FW4C and the other is a ThinkServer). The issue occurs every 5-7 days. The Protectli dropped less than an hour ago and I was able to see these items in the logs.

BACKEND:
2024-11-13T10:43:16-05:00 Notice configd.py [47db4b21-ab0e-4689-884b-9697ea0ec799] reconfiguring routing due to gateway alarm
2024-11-13T10:43:16-05:00 Notice configd.py ALERT: WAN_NeFrontier_GWv4 (Addr: 8.8.8.8 Alarm: loss -> down RTT: 6.2 ms RTTd: 0.4 ms Loss: 47.0 %)
2024-11-13T10:43:06-05:00 Error configd.py Timeout (120) executing : interface routes alarm
2024-11-13T10:41:06-05:00 Notice configd.py [9b964afb-2380-405a-8cf4-f2526810cd3d] list gateways
2024-11-13T10:41:05-05:00 Notice configd.py [a2547427-57b7-4d83-aa0d-4e09ca813a9a] show system routing table
2024-11-13T10:41:05-05:00 Notice configd.py [050c04c9-ff9d-497d-aabc-bb3164cc2736] reconfiguring routing due to gateway alarm


GENERAL:
2024-11-13T10:41:06-05:00 Notice opnsense /usr/local/etc/rc.routing_configure: plugins_configure monitor (execute task : dpinger_configure_do(1,[]))
2024-11-13T10:41:06-05:00 Notice opnsense /usr/local/etc/rc.routing_configure: plugins_configure monitor (1,[])
2024-11-13T10:41:06-05:00 Notice opnsense /usr/local/etc/rc.routing_configure: ROUTING: keeping inet default route to 47.207.54.1
2024-11-13T10:41:06-05:00 Notice opnsense /usr/local/etc/rc.routing_configure: ROUTING: configuring inet default gateway on wan
2024-11-13T10:41:05-05:00 Notice opnsense /usr/local/etc/rc.routing_configure: ROUTING: entering configure using defaults


AUDIT:
2024-11-13T10:43:16-05:00 Informational configd.py action allowed interface.routes.alarm for user root
2024-11-13T10:41:06-05:00 Informational configd.py action allowed interface.gateways.list for user root
2024-11-13T10:41:05-05:00 Informational configd.py action allowed interface.routes.list for user root
2024-11-13T10:41:05-05:00 Informational configd.py action allowed interface.routes.alarm for user root


I can schedule time to upgrade the firewalls to 24.7.8, but wanted to see if anyone else has seen this issue.
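
For anyone comparing notes, here is a minimal sketch for pulling just the gateway alarm transitions out of a saved log. It assumes the line format shown in the BACKEND excerpt above; the function name gateway_alarms is made up for illustration, and the input is whatever log file you exported (or stdin).

```shell
#!/bin/sh
# Hypothetical helper: for every gateway ALERT line, print the timestamp,
# gateway name, monitor address, state change, and packet loss.
# Reads the file(s) given as arguments, or stdin if none are given.
gateway_alarms() {
    sed -nE 's/^([^ ]+) .*ALERT: ([^ ]+) \(Addr: ([^ ]+) Alarm: ([^ ]+) -> ([^ ]+) .*Loss: ([0-9.]+) %\).*$/\1 \2 \3 \4->\5 loss=\6%/p' "$@"
}
```

Lines that are not ALERT entries are dropped, so the output across several outages can be lined up by timestamp to see whether the loss alarm always precedes the hang.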
#3
Thanks for the clarification. I'll go and remove this file from our firewalls if it exists.
#4
Yes, but I would assume (and that might be my problem) it followed log retention policies like the other logs.
#5
We received notification via Zabbix that one of our OPNsense firewalls was at 10% disk space free. We attempted to connect to it but the WebGUI was failing to load. We were able to access it via SSH, and upon running df we noticed the system was completely full. I manually deleted a few log files and restarted the WebGUI to get logged in. We had this issue about 2 months ago with this location but we actually rebuilt the firewall completely on new hardware, just restoring the original configuration.

After reviewing this issue further today, I went ahead and purged the rest of the logs, including RRD and Netflow data, but there was still 62GB of unaccounted-for space in use.

I ended up hopping back onto the shell and running the following command:
du -h / | grep '[0-9\.]\+G'

The results showed that 62G was under /var/cache/unbound.duckdb. On checking that folder, I found these files.

-rw-r--r--  1 root     unbound         2888 May 21 08:45 client.csv
-rw-r--r--  1 unbound  unbound          214 May 20 08:44 load.sql
-rw-r--r--  1 unbound  unbound  33267605145 May 20 08:44 query.csv
-rw-r--r--  1 unbound  unbound         1503 May 20 08:44 schema.sql
-rw-r--r--  1 root     unbound  33726476128 May 21 08:45 tmp_query.csv


Two query.csv files totalling 62GB seems a bit off to me. Any ideas on why these got so bad and how to prevent this issue in the future?
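
The grep above also matches any path that happens to contain a digit or dot followed by G, so a sorted listing can be easier to read. A rough sketch, assuming only the standard du, sort, and head utilities; the function name largest_dirs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest directories under DIR, sizes in KiB.
# -k prints plain numbers that sort correctly; -x stays on one filesystem.
largest_dirs() {
    dir=${1:-/}
    count=${2:-20}
    du -kx "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: largest_dirs /var 15
```

Each directory's total includes its subdirectories, so the top of the list is always the starting directory itself, followed by whichever subtree is eating the space.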
#6
Interesting. So two issues. First, after adding the items back under the Overrides GUI, I'm able to restart Unbound without issue. Second, when I attempt to run the command with -ddvv, I get the following error:

root@opnsense:/var/unbound/etc # unbound -ddvv -c /var/unbound/unbound.conf
[1690891368] unbound[46649:0] notice: Start of unbound 1.17.1.
[1690891368] unbound[46649:0] debug: chdir to /var/unbound
[1690891368] unbound[46649:0] debug: chroot to /var/unbound
[1690891368] unbound[46649:0] debug: drop user privileges, run as unbound
[1690891368] unbound[46649:0] debug: switching log to stderr
[1690891368] unbound[46649:0] debug: module config: "python iterator"
[1690891368] unbound[46649:0] notice: init module 0: python
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Python path configuration:
  PYTHONHOME = (not set)
  PYTHONPATH = (not set)
  program name = 'unbound'
  isolated = 0
  environment = 1
  user site = 1
  import site = 0
  sys._base_executable = ''
  sys.base_prefix = '/usr/local'
  sys.base_exec_prefix = '/usr/local'
  sys.platlibdir = 'lib'
  sys.executable = ''
  sys.prefix = '/usr/local'
  sys.exec_prefix = '/usr/local'
  sys.path = [
    '/usr/local/lib/python39.zip',
    '/usr/local/lib/python3.9',
    '/usr/local/lib/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x0000000829022000 (most recent call first):
<no Python frame>



I can see about pulling a backup of the configuration from prior to the upgrade to see if there is anything odd in how it generates the domainoverrides.conf file. I also have quite a few other units to upgrade so I can monitor those and post additional details if I run into it again.
#7
Ran into some issues upgrading to 23.7 and Unbound not starting. Figured I'd share this information as I did not see anyone else post this specific issue yet.

I upgraded from 23.1.11_1 to 23.7 on one of our client firewalls this evening. Upon completion, DNS services failed to start on the firewall. I was able to remote into another system, connect to the firewall from there, and noticed Unbound was not running. Attempting to start it spun for 10-15 seconds, then returned with it still offline. I connected to the firewall via SSH and ran the following command to check what happens when the service starts:

Command:
unbound -c /var/unbound/unbound.conf

Results:
/var/unbound/etc/domainoverrides.conf:1: error: syntax error
read /var/unbound/unbound.conf failed: 1 errors in configuration file
[1690860690] unbound[25940:0] fatal error: Could not read config file: /var/unbound/unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf


We still have some clients where domain overrides are under the Overrides section in the GUI and have not been moved over to Query Forwarding yet. Upon removing the entries from the Overrides section and adding them back in under Query Forwarding, I was able to successfully start the Unbound services and query the internal domain overrides.
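
The fatal error above points at unbound-checkconf, which can validate the configuration without trying to start the daemon. A small sketch of how that check could be wrapped; the function name check_unbound_conf and the CHECKCONF override are assumptions for illustration, not part of OPNsense.

```shell
#!/bin/sh
# Hypothetical wrapper: validate an Unbound config file and report the result.
# CHECKCONF can be overridden; it defaults to the stock unbound-checkconf tool.
check_unbound_conf() {
    conf=${1:-/var/unbound/unbound.conf}
    if "${CHECKCONF:-unbound-checkconf}" "$conf"; then
        echo "config OK: $conf"
    else
        echo "config has errors: $conf" >&2
        return 1
    fi
}

# Example: check_unbound_conf /var/unbound/unbound.conf
```

Running something like this right after an upgrade should surface the same domainoverrides.conf syntax error before anyone notices DNS is down.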
#8
22.7 Legacy Series / States Table Issues after Upgrade
December 06, 2022, 02:29:24 PM
We've been having a recurring issue across several of our OPNsense firewalls regarding UDP traffic (specifically VOIP) after an upgrade. It appears that on upgrading to a newer version, the state table is not being cleared and this traffic is failing. Going in after the reboot and using Firewall \ Diagnostics \ States \ Actions \ Reset state table clears them and the traffic begins to work again. Are there any suggestions on how to avoid having to do this after every update?
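
For reference, the same reset can be done from a shell. The sketch below assumes the GUI action is equivalent to pf's own flush command, pfctl -F states, which I have not confirmed against the OPNsense source; the function name reset_states and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: flush the whole pf state table, e.g. after an upgrade.
# With DRY_RUN=1 it only prints the command instead of running it.
reset_states() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "pfctl -F states"
    else
        pfctl -F states
    fi
}
```

Note that this drops every tracked connection, not just the stale UDP ones, so it only saves the trip through the GUI and does not explain why the old states survive the upgrade.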
#9
22.7 Legacy Series / Re: IPSec tunnels do not re-initiate
November 10, 2022, 07:07:43 PM
We also had issues with the IPSEC tunnels not re-establishing, even with DPD set up. We ended up using Monit to monitor the IPSEC tunnels and restart them if the tunnel ping failed.
#10
So we have different user accounts that have limited access to the firewall for specific purposes. One would be a user whose only reason to get on the firewall is to check the status of IPSEC tunnels and restart them as needed. In 22.1, I was able to just grant the user group access to "Status: IPSEC" and it worked. One of the firewalls I'm using is on 22.7.4 and I'm doing the same, but the IPSEC Status Overview page is loading empty. My assumption is this is due to the recent change (which I absolutely love) on how the Overview page displays the status.

Before I go ahead and submit a bug via Github, is anyone else having this issue?
#11
Do you know the old IP scheme? You could just try adding a Virtual IP to your LAN setup to get access to the old device using the original IP information.

You can add an alias under Interfaces \ Virtual IPs \ Settings. From there, click the [ + ] to add a new Virtual IP. Leave Mode as "IP Alias" and make sure the Interface is LAN. Under Address, enter the old firewall's IP address from the old IP scheme and set the subnet mask dropdown to 24 (unless it was something else before). Save, apply, try.
#12
We are running 22.1.9 and I'm trying to get os-ddclient configured. I was able to get it set up initially, but when I try to start the service or return to the GUI page to check settings, I get the following error:

/usr/local/opnsense/mvc/app/controllers/OPNsense/Base/ApiControllerBase.php:152: Error at /usr/local/opnsense/mvc/app/models/OPNsense/DynDNS/FieldTypes/AccountField.php:55 - Undefined index: ip (errno=8)

Also, during configuration I'm noticing that the Gateway Groups are not showing in the Interface to Monitor dropdown. The older client allowed this and we use this to handle failover interfaces. Is there a way to set this up using the new client?
#13
What logging level do you have set at the Top-Right of the Log File page? Might be that you have it set to only display a higher level than what is being recorded in the log files? Try changing to Debug and see if you get anything.
#14
22.1 Legacy Series / Re: Backup / restore broken?
June 26, 2022, 07:17:55 PM
What version is the backup and what version are you running? I had an issue where I was restoring a backup of 22.1.8 to a 22.1 unit before upgrading, and it caused problems because of the Unbound changes around version 22.1.4, I believe. I ended up upgrading the unit to the latest version first, then restored the backup and did not have any issues.
#15
Good deal. I'll look into loading the development branch to one of the units we have set aside for HA and see if that helps. Thanks for the update!