Hi,
Unfortunately, my OPNsense firewall has been crashing randomly for some time now. I cannot identify a pattern: sometimes it happens after a week or two, sometimes within 24 hours, mostly under low traffic (normal web surfing).
At first I thought it might be the combination of RAM disk and firewall logs, but the crashes continue to occur even after deactivation.
CPU temperatures seem normal.
Hardware/Configuration:
OPNsense 22.1.8_1-amd64
Sophos XG 105
Intel Atom Processor E3930 @ 1.30GHz (2 cores, 2 threads)
2048MB RAM
4x Intel I211
64GB SSD (ZFS)
No CARP or IPS in use.
Installed Plugins:
os-acme-client
os-ddclient
os-dmidecode
os-dyndns
os-git-backup
os-hw-probe
os-iperf
os-mdns-repeater
os-smart
os-telegraf
os-theme-cicada
os-udpbroadcastrelay
os-vnstat
os-wireguard (+ kmod)
Following tunables modified:
hw.ibrs_disable = 1
hw.igb.rx_process_limit = -1
hw.igb.tx_process_limit = -1
hw.mds_disable = 0
hw.pci.honor_msi_blacklist = 0
legal.intel_igb.license_ack = 1
net.inet.icmp.drop_redirect = 1
net.inet.ip.redirect = 0
vfs.zfs.arc_max = 256M
vm.pmap.pti = 0
I was able to capture a crash message from the serial console. Unfortunately I cannot post it in this message due to the character limit, so I uploaded it to my pastebin service and attached it as a file to this post.
https://paste.biocrafting.net/?ce2a1af0e2c5d868#FZUKBAbbQVpNkTaEyVsvc979ggYSfitZFNvNfZYR2njW
Does anybody have an idea what could be causing the crashes?
Best regards
It looks hardware-related:
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0003b69db0
vpanic() at vpanic+0x17f/frame 0xfffffe0003b69e00
panic() at panic+0x43/frame 0xfffffe0003b69e60
dblfault_handler() at dblfault_handler+0x1ce/frame 0xfffffe0003b69f20
Xdblfault() at Xdblfault+0xd7/frame 0xfffffe0003b69f20
--- trap 0x17, rip = 0xffffffff8110d1d6, rsp = 0xfffffe0003571dd0, rbp = 0xfffffe0003571dd0 ---
acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfffffe0003571dd0
acpi_cpu_idle() at acpi_cpu_idle+0x2ef/frame 0xfffffe0003571e10
cpu_idle_acpi() at cpu_idle_acpi+0x3e/frame 0xfffffe0003571e30
cpu_idle() at cpu_idle+0x9f/frame 0xfffffe0003571e50
sched_idletd() at sched_idletd+0x4e1/frame 0xfffffe0003571ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0003571f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0003571f30
--- trap 0x36200d0, rip = 0xffffffff80c2b91f, rsp = 0, rbp = 0xffffffff8131d1ea ---
mi_startup() at mi_startup+0xdf/frame 0xffffffff8131d1ea
Maybe a BIOS update can help here.
Cheers,
Franco
With a similar system, I get this kind of instability when lower C-states are allowed. You can run 'sysctl -a | grep cx_' to find out which C-states are allowed and which are in use. You can set 'sysctl hw.acpi.cpu.cx_lowest=CX', or set the tunable, to limit the lowest C-state to X. My box can only go down to C1 before becoming unstable.
Thanks both of you for your input.
I have to see how to update the BIOS on this appliance. Sophos does not provide standalone update files; maybe the BIOS updates automatically when XG is installed. I will have to try it.
I checked the C-states and it seems that the CPU only supports C0/C1?
root@FWOPS01DEL:~ # sysctl -a | grep cx_
hw.acpi.cpu.cx_lowest: C1
dev.cpu.1.cx_method: C1/hlt
dev.cpu.1.cx_usage_counters: 130821134
dev.cpu.1.cx_usage: 100.00% last 89us
dev.cpu.1.cx_lowest: C1
dev.cpu.1.cx_supported: C1/1/0
dev.cpu.0.cx_method: C1/hlt
dev.cpu.0.cx_usage_counters: 251320055
dev.cpu.0.cx_usage: 100.00% last 32us
dev.cpu.0.cx_lowest: C1
dev.cpu.0.cx_supported: C1/1/0
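In case it helps anyone comparing several boxes, here is a rough Python sketch that pulls the supported C-states per CPU out of pasted 'sysctl -a | grep cx_' output. It assumes each cx_supported entry is a slash-separated group that starts with the state name (as in "C1/1/0") and that multiple entries are separated by spaces; the function name and the sample text are only illustrative.

```python
import re

# Sample text copied from the sysctl output above (trimmed to the relevant lines).
SAMPLE = """\
hw.acpi.cpu.cx_lowest: C1
dev.cpu.1.cx_supported: C1/1/0
dev.cpu.0.cx_supported: C1/1/0
"""

def supported_cstates(sysctl_output):
    """Return {cpu_index: [state names]} from dev.cpu.N.cx_supported lines."""
    result = {}
    for line in sysctl_output.splitlines():
        m = re.match(r"dev\.cpu\.(\d+)\.cx_supported:\s*(.*)", line)
        if m:
            cpu = int(m.group(1))
            # Assumption: every entry looks like "C1/1/0"; keep the part before the first slash.
            result[cpu] = [entry.split("/")[0] for entry in m.group(2).split()]
    return result

if __name__ == "__main__":
    for cpu, states in sorted(supported_cstates(SAMPLE).items()):
        print(f"cpu{cpu}: {', '.join(states)}")
```

If every CPU lists only C1, as here, there is no deeper C-state left to restrict via hw.acpi.cpu.cx_lowest.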
Which states are used is determined by the BIOS (sometimes configurable), and some vendors have problems with lower C-states. In your case only C1 is supported (and used), so this should not be a problem.
The system ran stably for the last week, but today a crash occurred again. Unfortunately the BIOS is already at the latest version: the firewall model is EOL, and the last Sophos release, XG 17.5.17, had already been installed before.
But I took the opportunity to reinstall the system completely and restore the config.xml; maybe the behavior will improve. Originally I had installed OPNsense manually on top of FreeBSD so I could use ZFS.