Firewall stops responding/ Multiple Devices

Started by CanIKipThis, December 20, 2024, 04:55:31 PM

Hey guys,

I've had this problem on two completely different physical devices, with the same symptoms. Current system:

OPNsense 24.7.9_1
Protectli Vault Pro VP2410

2 x WAN Gateways (Frontier and Optimum).

The firewall is plugged into switches, not directly into the providers' ONT or gateways. The gateways are configured in a load-balancing group, and everything works as intended there. Randomly, every 4 or 5 days, the services on the firewall (web admin page, DNS, SNMP, routing, etc.) stop responding, or rather become REALLY slow to respond. I use Check MK to monitor various services, and here are some example graphs:

https://imgur.com/a/4HzK1Ml

Above is the Unbound DNS service and its response time. The red box marks the period when the issue starts. It's tough to see on the graph, but for about an hour the DNS service takes over 5 seconds to respond to the test query, when normally it's below 0.20.
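For anyone who wants to log the same DNS response time independently of the monitoring system, here is a minimal Python sketch. It is not what Check MK does internally; it simply sends one A-record query over UDP and times the reply. The server address and hostname used in the usage note are placeholders, not values from this thread.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet (A record, class IN, recursion desired)."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    labels = hostname.strip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    # QTYPE = 1 (A), QCLASS = 1 (IN).
    return header + qname + struct.pack(">HH", 1, 1)


def time_dns_query(server, hostname, timeout=6.0):
    """Return seconds until any reply arrives, or None if the query times out."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(512)
        except socket.timeout:
            return None
        return time.monotonic() - start
```

Calling `time_dns_query("192.168.1.1", "example.com")` in a loop and appending the result to a file with a timestamp would give a record to line up against the other logs. The address is an assumption; substitute the firewall's LAN address.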

https://imgur.com/a/YvlTlx0

Next, you can see from the gaps in CPU utilization that the SNMP monitoring service stops responding as well. I also get alerts from my internal and external monitoring that the HAProxy instance on the device has stopped responding.

During this period, the web admin interface can be REALLY slow. If I go into gateway groups, it states that one of the gateways is offline (even though I don't think it actually is). The only way to recover is to reboot the entire appliance.

I am trying to track it down but am having some trouble. Here is the gateway log. The issue seemed to start at 5:20 AM, which is the period this gateway.log covers:

<165>1 2024-12-20T05:20:21-05:00 firewall dpinger 57382 - [meta sequenceId="1"] ALERT: WAN_FRONTIER_DHCP (Addr: 9.9.9.9 Alarm: none -> loss RTT: 11.0 ms RTTd: 0.4 ms Loss: 12.0 %)
<12>1 2024-12-20T06:00:30-05:00 firewall dpinger 21244 - [meta sequenceId="1"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 9.9.9.9  bind_addr 32.218.108.73  identifier "WAN_FRONTIER_DHCP "
<12>1 2024-12-20T06:00:30-05:00 firewall dpinger 21946 - [meta sequenceId="2"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 1.1.1.1  bind_addr 69.113.210.79  identifier "WAN_OPTIMUM_DHCP "
<165>1 2024-12-20T06:00:31-05:00 firewall dpinger 22637 - [meta sequenceId="3"] Reloaded gateway watcher configuration on SIGHUP
<165>1 2024-12-20T06:00:32-05:00 firewall dpinger 22637 - [meta sequenceId="4"] ALERT: WAN_FRONTIER_DHCP (Addr: 9.9.9.9 Alarm: down -> none RTT: 10.8 ms RTTd: 0.6 ms Loss: 0.0 %)
<165>1 2024-12-20T06:00:32-05:00 firewall dpinger 22637 - [meta sequenceId="5"] ALERT: WAN_OPTIMUM_DHCP (Addr: 1.1.1.1 Alarm: down -> none RTT: 4.1 ms RTTd: 0.9 ms Loss: 0.0 %)
<12>1 2024-12-20T06:00:37-05:00 firewall dpinger 21244 - [meta sequenceId="6"] exiting on signal 15
<12>1 2024-12-20T06:00:37-05:00 firewall dpinger 73314 - [meta sequenceId="7"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 9.9.9.9  bind_addr 32.218.108.73  identifier "WAN_FRONTIER_DHCP "
<165>1 2024-12-20T06:00:37-05:00 firewall dpinger 22637 - [meta sequenceId="8"] Reloaded gateway watcher configuration on SIGHUP
<12>1 2024-12-20T06:00:37-05:00 firewall dpinger 21946 - [meta sequenceId="9"] exiting on signal 15
<12>1 2024-12-20T06:00:37-05:00 firewall dpinger 75904 - [meta sequenceId="10"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 1.1.1.1  bind_addr 69.113.210.79  identifier "WAN_OPTIMUM_DHCP "
<165>1 2024-12-20T06:00:37-05:00 firewall dpinger 22637 - [meta sequenceId="11"] Reloaded gateway watcher configuration on SIGHUP
<165>1 2024-12-20T06:00:40-05:00 firewall dpinger 22637 - [meta sequenceId="12"] ALERT: WAN_OPTIMUM_DHCP (Addr: 1.1.1.1 Alarm: none -> delay RTT: 405.9 ms RTTd: 561.6 ms Loss: 0.0 %)
<165>1 2024-12-20T06:00:52-05:00 firewall dpinger 22637 - [meta sequenceId="1"] Reloaded gateway watcher configuration on SIGHUP
<165>1 2024-12-20T06:00:52-05:00 firewall dpinger 22637 - [meta sequenceId="2"] ALERT: WAN_OPTIMUM_DHCP (Addr: 1.1.1.1 Alarm: delay -> none RTT: 91.5 ms RTTd: 307.5 ms Loss: 0.0 %)
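To make the alarm transitions in a log like this easier to scan, here is a small Python sketch that pulls out only the dpinger ALERT lines and their fields. The regular expression is written against the line format shown above and is an assumption about that format, not an official dpinger schema. The sample lines are copied from the log above.

```python
import re

# Matches dpinger ALERT lines in the syslog format shown in the log above.
ALERT_RE = re.compile(
    r"^<\d+>1 (?P<ts>\S+) \S+ dpinger \d+ - \[meta [^\]]*\] "
    r"ALERT: (?P<gw>\S+) \(Addr: (?P<addr>\S+) "
    r"Alarm: (?P<old>\w+) -> (?P<new>\w+) "
    r"RTT: (?P<rtt>[\d.]+) ms RTTd: (?P<rttd>[\d.]+) ms "
    r"Loss: (?P<loss>[\d.]+) %\)"
)


def parse_alerts(lines):
    """Return one dict per ALERT line; all other lines are skipped."""
    alerts = []
    for line in lines:
        m = ALERT_RE.match(line.strip())
        if not m:
            continue
        alerts.append({
            "time": m.group("ts"),
            "gateway": m.group("gw"),
            "monitor": m.group("addr"),
            "from": m.group("old"),
            "to": m.group("new"),
            "rtt_ms": float(m.group("rtt")),
            "rttd_ms": float(m.group("rttd")),
            "loss_pct": float(m.group("loss")),
        })
    return alerts


SAMPLE = [
    '<165>1 2024-12-20T05:20:21-05:00 firewall dpinger 57382 - [meta sequenceId="1"] ALERT: WAN_FRONTIER_DHCP (Addr: 9.9.9.9 Alarm: none -> loss RTT: 11.0 ms RTTd: 0.4 ms Loss: 12.0 %)',
    '<165>1 2024-12-20T06:00:31-05:00 firewall dpinger 22637 - [meta sequenceId="3"] Reloaded gateway watcher configuration on SIGHUP',
    '<165>1 2024-12-20T06:00:40-05:00 firewall dpinger 22637 - [meta sequenceId="12"] ALERT: WAN_OPTIMUM_DHCP (Addr: 1.1.1.1 Alarm: none -> delay RTT: 405.9 ms RTTd: 561.6 ms Loss: 0.0 %)',
]

alerts = parse_alerts(SAMPLE)
for a in alerts:
    print(a["time"], a["gateway"], a["from"], "->", a["to"],
          f'rtt={a["rtt_ms"]}ms loss={a["loss_pct"]}%')
```

Running this over the full gateway.log should give a compact timeline of when each gateway changed alarm state, which can then be compared against the monitoring graphs.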

Here is my DMESG log:

---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE-p6 stable/24.7-n267939-fd5bc7f34e1 SMP amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(efifb): resolution 800x600
CPU: Intel(R) Celeron(R) J4125 CPU @ 2.00GHz (1996.80-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x706a8  Family=0x6  Model=0x7a  Stepping=8
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4ff8ebbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x101<LAHF,Prefetch>
  Structured Extended Features=0x2294e287<FSGSBASE,TSCADJ,SGX,SMEP,ERMS,NFPUSG,MPX,PQE,RDSEED,SMAP,CLFLUSHOPT,PROCTRACE,SHA>
  Structured Extended Features2=0x40400004<UMIP,RDPID,SGXLC>
  Structured Extended Features3=0xac000400<MD_CLEAR,IBPB,STIBP,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0x6b<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8070303744 (7696 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INTEL  GLK-SOC >
WARNING: L1 data cache covers fewer APIC IDs than a core (0 < 1)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0 <Version 2.0> irqs 0-119
Launching APs: 3 2 1
random: entropy device external interface
wlan: mac acl policy registered
kbd0 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
smbios0: <System Management BIOS> at iomem 0x7a0d4000-0x7a0d401e
smbios0: Version: 3.2, BCD Revision: 3.2
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256>
acpi0: <ALASKA A M I >
unknown: I/O range not supported
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
atrtc0: Warning: Couldn't map I/O.
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 on acpi0
Timecounter "HPET" frequency 19200000 Hz quality 950
Event timer "HPET" frequency 19200000 Hz quality 550
Event timer "HPET1" frequency 19200000 Hz quality 440
Event timer "HPET2" frequency 19200000 Hz quality 440
Event timer "HPET3" frequency 19200000 Hz quality 440
Event timer "HPET4" frequency 19200000 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xf000-0xf03f mem 0xa0000000-0xa0ffffff,0x90000000-0x9fffffff irq 19 at device 2.0 on pci0
vgapci0: Boot video device
hdac0: <Intel Gemini Lake HDA Controller> mem 0xa1510000-0xa1513fff,0xa1000000-0xa10fffff irq 25 at device 14.0 on pci0
pci0: <simple comms> at device 15.0 (no driver attached)
ahci0: <Intel Gemini Lake AHCI SATA controller> port 0xf090-0xf097,0xf080-0xf083,0xf060-0xf07f mem 0xa1514000-0xa1515fff,0xa151a000-0xa151a0ff,0xa1519000-0xa15197ff irq 19 at device 18.0 on pci0
ahci0: AHCI v1.31 with 2 6Gbps ports, Port Multiplier supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
pcib1: <ACPI PCI-PCI bridge> irq 21 at device 19.0 on pci0
pci1: <ACPI PCI bus> on pcib1
igb0: <Intel(R) I210 Flashless (Copper)> port 0xe000-0xe01f mem 0xa1400000-0xa141ffff,0xa1420000-0xa1423fff irq 22 at device 0.0 on pci1
igb0: NVM V0.6 imgtype6
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
igb0: Ethernet address: 64:62:66:23:67:df
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib2: <ACPI PCI-PCI bridge> irq 21 at device 19.1 on pci0
pci2: <ACPI PCI bus> on pcib2
igb1: <Intel(R) I210 Flashless (Copper)> port 0xd000-0xd01f mem 0xa1300000-0xa131ffff,0xa1320000-0xa1323fff irq 23 at device 0.0 on pci2
igb1: NVM V0.6 imgtype6
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 4 RX queues 4 TX queues
igb1: Using MSI-X interrupts with 5 vectors
igb1: Ethernet address: 64:62:66:23:67:e0
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3: <ACPI PCI-PCI bridge> irq 21 at device 19.2 on pci0
pci3: <ACPI PCI bus> on pcib3
igb2: <Intel(R) I210 Flashless (Copper)> port 0xc000-0xc01f mem 0xa1200000-0xa121ffff,0xa1220000-0xa1223fff irq 20 at device 0.0 on pci3
igb2: NVM V0.6 imgtype6
igb2: Using 1024 TX descriptors and 1024 RX descriptors
igb2: Using 4 RX queues 4 TX queues
igb2: Using MSI-X interrupts with 5 vectors
igb2: Ethernet address: 64:62:66:23:67:e1
igb2: netmap queues/slots: TX 4/1024, RX 4/1024
pcib4: <ACPI PCI-PCI bridge> irq 21 at device 19.3 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 21 at device 0.0 on pci4
pci5: <ACPI PCI bus> on pcib5
pcib6: <PCI-PCI bridge> irq 22 at device 1.0 on pci5
pci6: <PCI bus> on pcib6
igb3: <Intel(R) I210 Flashless (Copper)> port 0xb000-0xb01f mem 0xa1100000-0xa111ffff,0xa1120000-0xa1123fff irq 22 at device 0.0 on pci6
igb3: NVM V0.6 imgtype6
igb3: Using 1024 TX descriptors and 1024 RX descriptors
igb3: Using 4 RX queues 4 TX queues
igb3: Using MSI-X interrupts with 5 vectors
igb3: Ethernet address: 64:62:66:23:67:e2
igb3: netmap queues/slots: TX 4/1024, RX 4/1024
pcib7: <PCI-PCI bridge> irq 20 at device 3.0 on pci5
pci7: <PCI bus> on pcib7
pcib8: <PCI-PCI bridge> irq 22 at device 5.0 on pci5
pci8: <PCI bus> on pcib8
pcib9: <PCI-PCI bridge> irq 20 at device 7.0 on pci5
pci9: <PCI bus> on pcib9
xhci0: <Intel Gemini Lake USB 3.0 controller> mem 0xa1500000-0xa150ffff irq 17 at device 21.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
sdhci_pci0: <Generic SD HCI> mem 0xa1518000-0xa1518fff,0xa1517000-0xa1517fff irq 39 at device 28.0 on pci0
sdhci_pci0: 1 slot(s) allocated
mmc0: <MMC/SD bus> on sdhci_pci0
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
acpi_button0: <Power Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
ns8250: UART FCR is broken
ns8250: UART FCR is broken
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
ns8250: UART FCR is broken
uart0: console (115200,n,8,1)
orm0: <ISA Option ROMs> at iomem 0xd0000-0xd0fff,0xd1000-0xd1fff,0xd2000-0xd2fff,0xd3000-0xd3fff pnpid ORM0000 on isa0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
Timecounter "TSC" frequency 1996800535 Hz quality 1000
Timecounters tick every 1.000 msec
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
ugen0.1: <Intel XHCI root HUB> at usbus0
uhub0 on usbus0
uhub0: <Intel XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
hdacc0: <Intel Gemini Lake HDA CODEC> at cad 2 on hdac0
hdaa0: <Intel Gemini Lake Audio Function Group> at nid 1 on hdacc0
pcm0: <Intel Gemini Lake (HDMI/DP 8ch)> at nid 3 on hdaa0
mmcsd0: 8GB <MMCHC 8GTF4R 0.6 SN 069293CA MFG 07/2023 by 21 0x0000> at mmc0 200.0MHz/8bit/8192-block
mmcsd0boot0: 4MB partition 1 at mmcsd0
mmcsd0boot1: 4MB partition 2 at mmcsd0
mmcsd0rpmb: 524kB partition 3 at mmcsd0
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <SSD 64GB W1120A0> ACS-2 ATA SATA 3.x device
ada0: Serial Number SSD00000000000001222
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 61057MB (125045424 512 byte sectors)
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: 16 ports with 16 removable, self powered
Dual Console: Video Primary, Serial Secondary
igb0: link state changed to UP
igb1: link state changed to UP
igb2: link state changed to UP
igb3: link state changed to UP
ichsmb0: <Intel Gemini Lake SMBus controller> port 0xf040-0xf05f mem 0xa1516000-0xa15160ff irq 20 at device 31.1 on pci0
smbus0: <System Management Bus> on ichsmb0
lo0: link state changed to UP
pflog0: permanently promiscuous mode enabled
igb1: link state changed to DOWN
vlan0: changing name to 'vlan0.250'
vlan1: changing name to 'vlan0.251'
vlan2: changing name to 'vlan0.901'
igb3: link state changed to DOWN
vlan3: changing name to 'vlan0.902'
igb0: link state changed to DOWN
igb2: link state changed to DOWN
igb1: link state changed to UP
vlan0.251: link state changed to UP
vlan0.250: link state changed to UP
igb3: link state changed to UP
vlan0.902: link state changed to UP
vlan0.901: link state changed to UP
igb0: link state changed to UP
igb2: link state changed to UP
wg0: changing name to 'wg1'
wg1: link state changed to UP
ipsec0: changing name to 'ipsec1'
tun1: changing name to 'ovpns1'
ovpns1: link state changed to UP
arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.72 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.72 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.72 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.72 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.72 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.72 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0
arp: 192.168.1.72 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.245 moved from ea:0f:29:66:78:03 to bc:24:11:8f:a8:71 on igb0
arp: 192.168.1.244 moved from 06:70:41:2d:5a:16 to bc:24:11:c8:18:bb on igb0
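The tail of the dmesg output above consists of repeated "arp: ... moved from ... to ..." messages. As a rough aid for reading them, the following Python sketch counts how often each IP address changed MAC and which MACs were involved. Whether these messages have anything to do with the slowdown is not established in this thread; the script only summarizes what the log already says. The sample lines are copied from the dmesg above.

```python
import re
from collections import defaultdict

# Matches the kernel "arp: <ip> moved from <mac> to <mac> on <if>" message.
ARP_MOVE_RE = re.compile(
    r"arp: (?P<ip>\d+\.\d+\.\d+\.\d+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<ifname>\S+)"
)


def summarize_arp_moves(lines):
    """Map each IP to its number of moves and the set of MACs seen for it."""
    summary = defaultdict(lambda: {"moves": 0, "macs": set()})
    for line in lines:
        m = ARP_MOVE_RE.search(line)
        if not m:
            continue
        entry = summary[m.group("ip")]
        entry["moves"] += 1
        entry["macs"].update((m.group("old"), m.group("new")))
    return dict(summary)


SAMPLE = [
    "arp: 192.168.1.102 moved from bc:24:11:8f:a8:71 to bc:24:11:c8:18:bb on igb0",
    "arp: 192.168.1.102 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0",
    "arp: 192.168.1.72 moved from bc:24:11:c8:18:bb to bc:24:11:8f:a8:71 on igb0",
    "igb0: link state changed to UP",
]

result = summarize_arp_moves(SAMPLE)
for ip, info in sorted(result.items()):
    print(ip, info["moves"], sorted(info["macs"]))
```

Feeding the whole dmesg through `summarize_arp_moves` shows at a glance which addresses keep flipping between the same pair of MACs.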

And here is my system log from the same time:

RENEW on igb2 executing
<13>1 2024-12-20T05:00:57-05:00 firewall.local dhclient 91753 - [meta sequenceId="2"] dhclient-script: New Hostname (igb2): firewall
<13>1 2024-12-20T05:00:57-05:00 firewall.local dhclient 93297 - [meta sequenceId="3"] dhclient-script: Creating resolv.conf
<13>1 2024-12-20T05:15:57-05:00 firewall.local dhclient 41740 - [meta sequenceId="1"] dhclient-script: Reason RENEW on igb2 executing
<13>1 2024-12-20T05:15:57-05:00 firewall.local dhclient 42298 - [meta sequenceId="2"] dhclient-script: New Hostname (igb2): firewall
<13>1 2024-12-20T05:15:57-05:00 firewall.local dhclient 43104 - [meta sequenceId="3"] dhclient-script: Creating resolv.conf
<13>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="1"] /usr/local/etc/rc.routing_configure: ROUTING: entering configure using defaults
<13>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="2"] /usr/local/etc/rc.routing_configure: ROUTING: configuring inet default gateway on opt4
<13>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="3"] /usr/local/etc/rc.routing_configure: ROUTING: keeping inet default route to 32.218.108.1
<11>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="4"] /usr/local/etc/rc.routing_configure: The command '/sbin/route add -'inet' '172.16.7.253' -interface 'ipsec1'' returned exit code '1', the output was 'add host 172.16.7.253: gateway ipsec1 fib 0: route already in table'
<13>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="5"] /usr/local/etc/rc.routing_configure: plugins_configure monitor (1,[])
<13>1 2024-12-20T05:20:21-05:00 firewall.local opnsense 25562 - [meta sequenceId="6"] /usr/local/etc/rc.routing_configure: plugins_configure monitor (execute task : dpinger_configure_do(1,[]))
<13>1 2024-12-20T05:44:59-05:00 firewall.local dhclient 42974 - [meta sequenceId="1"] dhclient-script: Reason RENEW on igb2 executing
<13>1 2024-12-20T05:44:59-05:00 firewall.local dhclient 43826 - [meta sequenceId="2"] dhclient-script: New Hostname (igb2): firewall
<13>1 2024-12-20T05:44:59-05:00 firewall.local dhclient 44736 - [meta sequenceId="3"] dhclient-script: Creating resolv.conf
<13>1 2024-12-20T05:58:34-05:00 firewall.local configctl 37941 - [meta sequenceId="1"] event @ 1734692313.93 msg: Dec 20 05:58:33 firewall.local config[61557]: config-event: new_config /conf/backup/config-1734692313.9114.xml
<13>1 2024-12-20T05:58:34-05:00 firewall.local configctl 37941 - [meta sequenceId="2"] event @ 1734692313.93 exec: system event config_changed response: OK 
<13>1 2024-12-20T05:58:38-05:00 firewall.local configctl 37941 - [meta sequenceId="3"] event @ 1734692317.63 msg: Dec 20 05:58:37 firewall.local config[44535]: config-event: new_config /conf/backup/config-1734692317.6048.xml


Like I said, this was also happening on my old HP T620plus. I figured it was time for a replacement, so I installed the Protectli, and the same thing is happening. Any suggestions? I've tried moving the ONT/gateways onto a WAN switch with the firewall, but that didn't change anything. I've also replaced the cables, etc.

I have the same issue. I set up an OPNsense box so I can slowly configure it and eventually switch it into production, but if it sits for a few days, it just dies. I can ping it, but can't do anything else.

It's too bad, because I'm trying to move from pfSense and upgrade to new hardware at the same time, but so far I can't trust this for production.

I'm seeing the same thing. I've been running OPNsense for about a year in a Proxmox environment and it's been flawless. I'm pretty sure this started with the last system update (I update on a daily basis). I have OPNsense in a 2-node High Availability setup on two different Proxmox servers, and both firewalls die at a similar time. They are 'up', but the CPU is so busy that the firewalls stop passing traffic.

I'm pinning all my hopes on it being fixed in the next update.

Interesting, do you happen to be running Ubiquiti switches at all?

I have a theory that it's related to version 7.x of the Ubiquiti firmware.

Nope, even downgrading the Ubiquiti switch didn't fix it. Can anyone tell me which logs etc. I should be looking at to troubleshoot? Struggling here.

So this is odd: I used the same hardware and essentially the same config with pfSense, and the latency spikes went away. Is there anything I can do to troubleshoot this? I would rather use OPNsense. Here is an example graph; you can see where I installed pfSense and the spikes went away:

https://imgur.com/a/AumpygT

And this is the latency when pinging the firewall itself:

https://imgur.com/a/k66aonU

Are you using dual stack? I've seen odd behavior with over 1 second of latency on uploads in bufferbloat tests, but it seemed to be ISP-specific, i.e. the same configuration and HW would run much better on another ISP. The problematic ISP is IPv4-only with IPv6 done on OPNsense through HE, while the other one is a dual-stack ISP with DHCPv6.