Messages - mokai

General Discussion / CARP Firewall Outage with massive Process Pile-up of update_tables.py & pfctl

December 19, 2025, 10:31:39 AM

Hi everyone,

last night we experienced a severe outage of about 3-4 minutes on our OPNsense firewall: the system load spiked to over 100, the web GUI became unresponsive, and all network traffic was blocked.

System Environment:

OS: OPNsense (25.7.3_7)

Services: Multiple OpenVPN servers, a large number of aliases/tables (each with a small number of entries, <15).

Observed Symptoms:

- Load Average > 100.
- Hundreds of python3 ... update_tables.py --types authgroup processes.
- Multiple pfctl -t [ALIAS] -T replace -f /var/db/aliastables/[...].txt processes stuck in state R (running) or D (disk wait).
- Frequent ovpn_event.py triggers (add/delete/update).

The same pfctl command showed up 3 or 4 times for the same table, which gives me the impression that there is a bug.

We have attached a redacted log file. This is a production HA/CARP firewall in the datacenter, so this issue is giving me quite a headache.

Is this a known issue?
Why is there no flock or other locking mechanism on update_tables.py?
Are we using the wrong hardware for this?
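
To illustrate what I mean by a locking mechanism, here is a minimal sketch of a non-blocking flock guard in Python. This is not how update_tables.py actually works; the helper name `single_instance` and the lock file path are hypothetical and only show the general idea of skipping a run while another one is still active.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this caller holds the lock, False if someone else does.

    lock_path is a hypothetical lock file location chosen by the caller.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # LOCK_NB makes the call fail immediately instead of queueing up.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        # Closing the descriptor also releases the lock if we held it.
        os.close(fd)
```

A caller would wrap the table update in `with single_instance(path) as ok:` and simply return when `ok` is false, so that overlapping triggers do not pile up behind each other.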

Specs: 8 x E-2234 @ 3.6GHz, 8GB RAM, 200 GB SSD, 1 x 40 GE LAGG, 1 x 10GE LAGG, 1 x 1GE CARP
Usage: 12 VLANs, 3 WireGuard servers (200 clients), 3 OpenVPN servers (200 clients, little traffic), average traffic ~1GB/s on all interfaces

Thanks for any help

25.7, 25.10 Series / High CPU Load every morning at 8am by update_tables.py leads to network timeouts

September 05, 2025, 06:29:56 PM

Hi,

for a couple of weeks we've been seeing irregular CPU load spikes on our primary HA firewall node. The issue occurs most frequently around **8 AM** and, as far as we can tell, it is caused by a large number of concurrent Python processes (about 20), mostly update_tables.py.

This looks very similar to the issue described here:
[https://forum.opnsense.org/index.php?topic=45620.0](https://forum.opnsense.org/index.php?topic=45620.0)

**Differences in our setup:**

* We don't have huge alias lists: ~450 aliases with about 3–15 entries each (~3,000–4,000 entries total).
* Hardware: 2 × Dell R340 (Xeon E2234, 8 cores, 16 GB RAM, SSD storage).
* Network: ~15 VLANs, 1 × 40 GbE LAGG (Mellanox ConnectX-3), 1 × 10 GbE LAGG (Mellanox ConnectX-3), ~20 virtual CARP IPs.

**Symptoms:**

* When the spike occurs, latency rises sharply and interfaces even stop responding.
* Example:

  --- gw01 ping statistics ---
  188 packets transmitted, 179 received, 4.8% packet loss

* After 1–2 minutes, the Python processes calm down and latency returns to normal.
* The issue sometimes occurs at other times of the day, but right now **8 AM is a daily pattern**.

At this point we don't really know how to debug this further, nor can we pinpoint what exactly is triggering such a high load.

**Question:**
Has anyone experienced something similar or can provide hints on how to further analyze this?
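
One thing we could do ourselves during a spike is capture a process snapshot and count identical command lines. Below is a rough sketch of that idea, assuming the output of something like `ps -axww -o pid,state,command` is saved to a file; the function names are made up for illustration.

```python
from collections import Counter


def count_commands(ps_output):
    """Count identical command lines in three-column ps output.

    Expects the columns pid, state, command with one header row.
    """
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, state, rest of line = command
        if len(parts) == 3:
            counts[parts[2].strip()] += 1
    return counts


def duplicates(ps_output, threshold=2):
    """Return (command, count) pairs seen at least `threshold` times."""
    ranked = count_commands(ps_output).most_common()
    return [(cmd, n) for cmd, n in ranked if n >= threshold]
```

Running `duplicates()` on snapshots taken a few seconds apart should show whether the same update_tables.py invocation is being started again before the previous one has finished.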

Thanks!

(Version is 25.7.2, but the problem already occurred in an earlier release.)

24.1, 24.4 Legacy Series / Re: Ipmi Tool missing After Update to 24.1

July 04, 2024, 05:34:20 PM

Since this topic led me in the wrong direction for 3 hours, I am linking the post that solved the issue:

https://forum.opnsense.org/index.php?topic=40240.msg197388#msg197388

pkg install freeipmi

24.1, 24.4 Legacy Series / Re: Ipmi Tool missing After Update to 24.1

July 04, 2024, 05:32:05 PM

Is there any news on this issue? I think it started in a late 23.x release, and an update to the latest 24.1.9_4 does not help. This tool is essential for our server monitoring.

Thanks Kai

General Discussion / [Feature] Finegrained restart control for wireguard

September 05, 2023, 04:50:17 PM

Hi,

we are running an HA firewall with OPNsense and 4 WireGuard servers and are pretty happy so far. However, as soon as we add a new connection and click "Apply", all 4 WireGuard servers get restarted, which disconnects all users (for up to <keepalive> seconds).

Is it possible to apply only the config of the WG server that has changes? If not, what would be needed to get this into a future version of OPNsense?
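
To make the request more concrete, here is a small sketch of the change detection I have in mind. It assumes each server's rendered config is available as text and that digests of the last applied configs are stored somewhere; the function names and the instance names used with them are hypothetical, and this is not how OPNsense currently handles it.

```python
import hashlib


def config_digest(text):
    """Return a stable fingerprint of one rendered config."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_instances(previous_digests, current_configs):
    """Return the names of instances whose config differs from the last applied one.

    previous_digests: dict of instance name -> digest from the last apply
    current_configs:  dict of instance name -> freshly rendered config text
    """
    changed = []
    for name, text in sorted(current_configs.items()):
        # New instances have no stored digest and therefore count as changed.
        if previous_digests.get(name) != config_digest(text):
            changed.append(name)
    return changed
```

Only the instances returned here would then need to be touched, for example with `wg syncconf` from wireguard-tools, while the unchanged servers keep running.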

Thanks

German - Deutsch / Re: After update - error on startup

March 01, 2023, 07:06:53 PM

The same thing happened to us today, and by now I can trace it to this: we ran an update, and the primary output was on the serial console. On the video console, all we saw was

Dual Console: Serial Primary, Video Secondary

and the system appeared to hang at "Root mount waiting for usbus0", whereupon we cleverly rebooted it manually, presumably in the middle of the update ...

21.7 Legacy Series / Re: OPNsense cannot connect via TLS to any server with an Let's Encrypt certificate.

September 30, 2021, 08:02:46 PM

I think this is a bigger issue that is hitting a lot of people and should be raised in priority.

https://twitter.com/Scott_Helme/status/1443293844292919304

21.7 Legacy Series / Re: OPNsense cannot connect via TLS to any server with an Let's Encrypt certificate.

September 30, 2021, 08:01:44 PM

It also affects ldaps:// connections (!!) and in our case led to broken VPN authentication for our users.

It seems the OpenSSL version is the problem. The only thing that has helped us so far is importing the LDAP server's LE certificate into Trust->Authorities. The SSL client then trusts it directly, and the (presumably broken) chain is not checked.
