OPNsense DEC2750 [24.10.2]: Constant crashes

Started by if8ps3Jc, February 21, 2025, 01:43:35 AM

February 21, 2025, 01:43:35 AM Last Edit: February 21, 2025, 01:48:01 AM by if8ps3Jc
Hi all
I had this issue last year for about two months (August until roughly October), and then it suddenly disappeared. Now it seems to be back:
One of the firewalls I manage is constantly crashing (currently about 2-5 times an hour!). The firewall is in a production environment where a stable internet connection is crucial. The warranty for this device has not yet expired, in case this is a hardware issue.


System Information:

User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0
FreeBSD 14.1-RELEASE-p7 stable/24.7-n268020-d553534fe81 SMP amd64
OPNsense 24.10.2 e381f0d04
Plugins os-OPNBEcore-1.4_4 os-ddclient-1.26 os-etpro-telemetry-1.7_5 os-nextcloud-backup-1.0_1
Time Fri, 21 Feb 2025 01:17:48 +0100
OpenSSL 3.0.15
Python 3.11.11
PHP 8.2.27

Logs:

<118>Thu Feb 20 20:35:39 CET 2025
<118>
<118>*** <REDACTED>: OPNsense 24.10.1 (amd64) ***
<118>
<118> ALL_LAN (bridge0) -> v4: <REDACTED>
<118> LAN_1 (igb1)    ->
<118> LAN_2 (igb2)    ->
<118> SFTP_LAN_1 (ax0) ->
<118> SFTP_LAN_2 (ax1) ->
<118> WAN (igb0)      -> v4/DHCP4: <REDACTED>
<118>
<118> HTTPS: sha256 <REDACTED>
<118>               <REDACTED>
236.579097 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
236.587447 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
236.595348 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
236.603827 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
236.611749 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
236.620217 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
<6>bridge0: permanently promiscuous mode enabled
236.630542 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
236.715671 [ 319] generic_netmap_register   Emulated adapter for bridge0 activated
236.738936 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
236.746344 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
236.753875 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: permanently promiscuous mode enabled
236.763865 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: link state changed to DOWN
237.904601 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
237.912010 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
237.919542 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: link state changed to UP
<6>igb0: permanently promiscuous mode disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address    = 0x28
fault code        = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff80d052f4
stack pointer            = 0x28:0xfffffe00b1080690
frame pointer            = 0x28:0xfffffe00b10806b0
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process        = 85006 (pfctl)
rdi: fffff8017b23ce60 rsi: fffffe00b10806c8 rdx: 0000000000000000
rcx: fffff8017b23ce60  r8: 0000000000000000  r9: 8080808080808080
rax: 0000000000000000 rbx: fffffe00b10806c8 rbp: fffffe00b10806b0
r10: fffff80173a54000 r11: ffffffffffffffff r12: 0000000000000000
r13: fffffe00c8482000 r14: ffffffff821edeb0 r15: fffff8017b23ce60
trap number        = 12
panic: page fault
cpuid = 2
time = 1740080523
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b1080380
vpanic() at vpanic+0x131/frame 0xfffffe00b10804b0
panic() at panic+0x43/frame 0xfffffe00b1080510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00b1080570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00b10805c0
calltrap() at calltrap+0x8/frame 0xfffffe00b10805c0
--- trap 0xc, rip = 0xffffffff80d052f4, rsp = 0xfffffe00b1080690, rbp = 0xfffffe00b10806b0 ---
rn_walktree() at rn_walktree+0x54/frame 0xfffffe00b10806b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00b1080710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00b1080bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00b1080c40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00b1080cb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00b1080cd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00b1080d40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00b1080e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00b1080f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b1080f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x14d360b785fa, rsp = 0x14d35c65b028, rbp = 0x14d35c65b4c0 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p6 stable/24.7-n267939-fd5bc7f34e1 SMP

I have been looking into the rn_walktree() / pfr_get_addrs() panic combo for a bit, and it looks like it affects all CPU types and dates back to at least 23.x, when we were still deep in FreeBSD 13 territory. At one point the idea was that mdns-repeater plays a role, but you don't have that installed, and other statistical data shows that this is not easily traceable to a particular piece of software.

The crash can occur when "pfctl -t tablename -T show" is executed. The backtraces never hint at a different code path into rn_walktree(), which could mean that this is the only buggy one or that the others are much harder to crash. To test that theory, there is a test patch that makes the lock in pfr_get_addrs() exclusive instead of shared between readers:

https://github.com/opnsense/src/commit/2dfe3735c8eeb

On 24.10 you can install this with

# opnsense-update -zkr 24.7.13-radix
(reboot)

On 25.1 you can install this with

# opnsense-update -zkr 25.1.1-radix
(reboot)

Try the kernel and let us know if the crash keeps happening or is gone.

Unrelated memory corruption panics occur on the same affected systems so this could be nothing, but OTOH the code path here sticks out very prominently so it could point to the problem source in the code.
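
To compare crash reports by code path, a minimal Python sketch like the following can pull the function names between the "--- trap" and "--- syscall" markers out of a panic log. The excerpt is copied from the backtrace in the first post, with the outer frames left out; the helper name faulting_path and the regular expression are illustrative assumptions, not part of any OPNsense tooling.

```python
import re

# Excerpt of the panic log from the first post (outer frames omitted).
PANIC_EXCERPT = """\
--- trap 0xc, rip = 0xffffffff80d052f4, rsp = 0xfffffe00b1080690, rbp = 0xfffffe00b10806b0 ---
rn_walktree() at rn_walktree+0x54/frame 0xfffffe00b10806b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00b1080710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00b1080bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00b1080c40
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x14d360b785fa, rsp = 0x14d35c65b028, rbp = 0x14d35c65b4c0 ---
"""

# Matches lines such as "rn_walktree() at rn_walktree+0x54/frame 0x...".
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def faulting_path(log_text):
    """Return the function names between the trap and syscall markers, innermost first."""
    frames = []
    in_trap = False
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("--- trap"):
            in_trap = True
        elif line.startswith("--- syscall"):
            break
        elif in_trap:
            match = FRAME_RE.match(line)
            if match:
                frames.append(match.group(1))
    return frames


print(" <- ".join(faulting_path(PANIC_EXCERPT)))
# prints: rn_walktree <- pfr_get_addrs <- pfioctl <- devfs_ioctl
```

Running the same function over several reports and comparing the first two or three entries is a quick way to check whether they share the rn_walktree() / pfr_get_addrs() path.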


Thanks,
Franco

hi there - thanks for your fast reply :)

I've just installed the patch and rebooted the system:
opnsense-update -zkr 24.7.13-radix
Fetching kernel-24.7.13-radix-amd64.txz: .... done
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Installing kernel-24.7.13-radix-amd64.txz... done
Please reboot.
Let's see how the next few hours go :)

It took only like 5 minutes until the next crash occurred.

System Information:

User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0
FreeBSD 14.1-RELEASE-p8 pf_wlock2-n268025-e0101f20aee SMP amd64
OPNsense 24.10.2 e381f0d04
Plugins os-OPNBEcore-1.4_4 os-ddclient-1.26 os-etpro-telemetry-1.7_5 os-nextcloud-backup-1.0_1
Time Fri, 21 Feb 2025 17:17:21 +0100
OpenSSL 3.0.15
Python 3.11.11
PHP 8.2.27

Here are some corresponding dmesg.boot error logs:

---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE-p8 pf_wlock2-n268025-e0101f20aee SMP amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(vga): resolution 640x480
CPU: AMD Ryzen Embedded V1500B                       (2196.03-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x810f10  Family=0x17  Model=0x11  Stepping=0
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
  Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr,IBPB>
  SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8223481856 (7842 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INSYDE EDK2    >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe0Block: 64/8 (20221020/tbfadt-748)
ioapic0: MADT APIC ID 33 != hw id 0
ioapic1: MADT APIC ID 34 != hw id 0
ioapic0 <Version 2.1> irqs 0-23
ioapic1 <Version 2.1> irqs 24-55
Launching APs: 2 1 3
random: entropy device external interface
wlan: mac acl policy registered
kbd0 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
vtvga0: <VT VGA driver>
smbios0: <System Management BIOS> at iomem 0xce155000-0xce15501e
smbios0: Version: 3.1, BCD Revision: 3.1
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256>
acpi0: <INSYDE EDK2>
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 450
Event timer "HPET2" frequency 14318180 Hz quality 450
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.1 on pci0
pci1: <ACPI PCI bus> on pcib1
nvme0: <Generic NVMe Device> mem 0xd0900000-0xd0903fff at device 0.0 on pci1
pcib2: <ACPI PCI-PCI bridge> at device 1.2 on pci0
pci2: <ACPI PCI bus> on pcib2
igb0: <Intel(R) I210 Flashless (Copper)> port 0x3000-0x301f mem 0xd0800000-0xd081ffff,0xd0820000-0xd0823fff at device 0.0 on pci2
igb0: NVM V0.6 imgtype6
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
<6>igb0: Ethernet address: <REDACTED>
<6>igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3: <ACPI PCI-PCI bridge> at device 1.3 on pci0
pci3: <ACPI PCI bus> on pcib3
igb1: <Intel(R) I210 Flashless (Copper)> port 0x2000-0x201f mem 0xd0700000-0xd071ffff,0xd0720000-0xd0723fff at device 0.0 on pci3
igb1: NVM V0.6 imgtype6
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 4 RX queues 4 TX queues
igb1: Using MSI-X interrupts with 5 vectors
<6>igb1: Ethernet address: <REDACTED>
<6>igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib4: <ACPI PCI-PCI bridge> at device 1.4 on pci0
pci4: <ACPI PCI bus> on pcib4
igb2: <Intel(R) I210 Flashless (Copper)> port 0x1000-0x101f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff at device 0.0 on pci4
igb2: NVM V0.6 imgtype6
igb2: Using 1024 TX descriptors and 1024 RX descriptors
igb2: Using 4 RX queues 4 TX queues
igb2: Using MSI-X interrupts with 5 vectors
<6>igb2: Ethernet address: <REDACTED>
<6>igb2: netmap queues/slots: TX 4/1024, RX 4/1024
pcib5: <ACPI PCI-PCI bridge> at device 8.1 on pci0
pci5: <ACPI PCI bus> on pcib5
pci5: <encrypt/decrypt> at device 0.2 (no driver attached)
xhci0: <XHCI (generic) USB 3.0 controller> mem 0xd0300000-0xd03fffff at device 0.3 on pci5
xhci0: 64 bytes context size, 64-bit DMA
usbus0: waiting for BIOS to give up control
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
xhci1: <XHCI (generic) USB 3.0 controller> mem 0xd0200000-0xd02fffff at device 0.4 on pci5
xhci1: 64 bytes context size, 64-bit DMA
usbus1: waiting for BIOS to give up control
usbus1 on xhci1
usbus1: 5.0Gbps Super Speed USB v3.0
pci5: <multimedia> at device 0.5 (no driver attached)
hdac0: <AMD Raven HDA Controller> mem 0xd0540000-0xd0547fff at device 0.6 on pci5
pci5: <old, non-VGA display device> at device 0.7 (no driver attached)
pcib6: <ACPI PCI-PCI bridge> at device 8.2 on pci0
pci6: <ACPI PCI bus> on pcib6
ax0: <AMD 10 Gigabit Ethernet Driver> mem 0xd0060000-0xd007ffff,0xd0040000-0xd005ffff,0xd0082000-0xd0083fff at device 0.1 on pci6
ax0: Using 2048 TX descriptors and 2048 RX descriptors
ax0: Using 3 RX queues 3 TX queues
ax0: Using MSI-X interrupts with 7 vectors
<6>ax0: Ethernet address: <REDACTED>
ax0: xgbe_config_sph_mode: SPH disabled in channel 0
ax0: xgbe_config_sph_mode: SPH disabled in channel 1
ax0: xgbe_config_sph_mode: SPH disabled in channel 2
ax0: RSS Enabled
ax0: Receive checksum offload Enabled
ax0: VLAN filtering Enabled
ax0: VLAN Stripping Enabled
ax0: Checking GPIO expander validity
ax0: GPIO configuration valid
ax0: xgbe_phy_sfp_signals: port_sfp_inputs: 0x7
ax0: xgbe_phy_sfp_detect: mod absent
<6>ax0: netmap queues/slots: TX 3/2048, RX 3/2048
ax1: <AMD 10 Gigabit Ethernet Driver> mem 0xd0020000-0xd003ffff,0xd0000000-0xd001ffff,0xd0080000-0xd0081fff at device 0.2 on pci6
ax1: Using 2048 TX descriptors and 2048 RX descriptors
ax1: Using 3 RX queues 3 TX queues
ax1: Using MSI-X interrupts with 7 vectors
<6>ax1: Ethernet address: <REDACTED>
ax1: xgbe_config_sph_mode: SPH disabled in channel 0
ax1: xgbe_config_sph_mode: SPH disabled in channel 1
ax1: xgbe_config_sph_mode: SPH disabled in channel 2
ax1: RSS Enabled
ax1: Receive checksum offload Enabled
ax1: VLAN filtering Enabled
ax1: VLAN Stripping Enabled
ax1: xgbe_phy_rx_reset: firmware mailbox reset performed
ax1: Checking GPIO expander validity
ax1: GPIO configuration valid
ax1: xgbe_phy_sfp_signals: port_sfp_inputs: 0x7
ax1: xgbe_phy_sfp_detect: mod absent
<6>ax1: netmap queues/slots: TX 3/2048, RX 3/2048
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
uart2: <16x50 with 256 byte FIFO> iomem 0xfedc9000-0xfedc9fff,0xfedc7000-0xfedc7fff irq 3 on acpi0
ns8250: UART FCR is broken
uart2: console (115384,n,8,1)
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
Timecounter "TSC-low" frequency 1097938286 Hz quality 1000
Timecounters tick every 1.000 msec
ugen1.1: <AMD XHCI root HUB> at usbus1
ugen0.1: <AMD XHCI root HUB> at usbus0
uhub0 on usbus1
uhub1 on usbus0
uhub1: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub0: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
nda0 at nvme0 bus 0 scbus0 target 0 lun 1
nda0: <REDACTED>
nda0: Serial Number <REDACTED>
nda0: nvme version 1.4
nda0: 244198MB (500118192 512 byte sectors)
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: 3 ports with 3 removable, self powered
uhub1: 8 ports with 8 removable, self powered
<118>Mounting filesystems...
<6>pid 30 (zpool) is attempting to use unsafe AIO requests - not logging anymore
<118>no pools available to import
<118>Setting hostuuid: <REDACTED>.
<118>Setting hostid: 0x<REDACTED>.
<118>Configuring crash dump device: /dev/gpt/swapfs
<118>swapon: adding /dev/gpt/swapfs as swap device
<118>.ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/compat/pkg /usr/local/lib/ipsec /usr/local/lib/perl5/5.36/mach/CORE
<118>32-bit compatibility ldconfig path:
<118>done.
<118>>>> Invoking early script 'upgrade'
<118>>>> Invoking early script 'configd'
<118>Starting configd.
<118>>>> Invoking early script 'templates'
<118>Generating configuration: OK
<118>>>> Invoking early script 'backup'
<118>>>> Invoking backup script 'captiveportal'
<118>>>> Invoking backup script 'dhcpleases'
<118>>>> Invoking backup script 'duid'
<118>>>> Invoking backup script 'netflow'
<118>>>> Invoking backup script 'rrd'
<118>>>> Invoking early script 'carp'
<118>CARP event system: OK
<118>Launching the init system...done.
<118>Initializing..........done.
<6>igb0: link state changed to UP
<6>igb1: link state changed to UP
<118>Starting device manager...
intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
smbus0: <System Management Bus> on intsmb0
driver bug: Unable to set devclass (class: ppc devname: (unknown))
ig4iic0: <Designware I2C Controller> iomem 0xfedc4000-0xfedc4fff irq 14 on acpi0
iicbus0: <Philips I2C bus (ACPI-hinted)> on ig4iic0
<118>done.
<118>Configuring login behaviour...done.
<118>Configuring loopback interface...
<6>lo0: link state changed to UP
<118>done.
<118>Configuring kernel modules...
amdsmn0: <AMD Family 17h System Management Network> on hostb0
amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
<118>done.
<118>Setting up extended sysctls...done.
<118>Setting timezone: Europe/Zurich
<118>Writing firmware settings: FreeBSD OPNsense
<118>Writing trust files...done.
<118>Scanning /usr/share/certs/untrusted for certificates...
<118>Scanning /usr/share/certs/trusted for certificates...
<118>Scanning /usr/local/share/certs for certificates...
<118>certctl: No changes to trust store were made.
<118>Writing trust bundles...done.
<118>Setting hostname: <REDACTED>
<118>Generating /etc/resolv.conf...done.
<118>Generating /etc/hosts...done.
<118>Configuring system logging...done.
<118>Configuring firewall.......
<6>pflog0: permanently promiscuous mode enabled
<118>done.
<118>Configuring hardware interfaces...done.
<118>Configuring loopback interface...done.
<118>Configuring LAGG interfaces...done.
<118>Configuring VLAN interfaces...done.
<118>Configuring LAN_1 interface...
<6>igb1: link state changed to DOWN
<118>done.
<118>Configuring LAN_2 interface...done.
<118>Configuring SFTP_LAN_1 interface...
ax0: xgbe_config_sph_mode: SPH disabled in channel 0
ax0: xgbe_config_sph_mode: SPH disabled in channel 1
ax0: xgbe_config_sph_mode: SPH disabled in channel 2
ax0: RSS Enabled
ax0: Receive checksum offload Disabled
ax0: VLAN filtering Disabled
ax0: VLAN Stripping Disabled
<118>done.
<118>Configuring SFTP_LAN_2 interface...
ax1: xgbe_config_sph_mode: SPH disabled in channel 0
ax1: xgbe_config_sph_mode: SPH disabled in channel 1
ax1: xgbe_config_sph_mode: SPH disabled in channel 2
ax1: RSS Enabled
ax1: Receive checksum offload Disabled
ax1: VLAN filtering Disabled
ax1: VLAN Stripping Disabled
<118>done.
<118>Configuring WAN interface...
<6>igb0: link state changed to DOWN
<6>igb1: link state changed to UP
<6>igb0: link state changed to UP
<118>done.
<6>bridge0: Ethernet address: <REDACTED>
<6>bridge0: link state changed to UP
<6>igb1: promiscuous mode enabled
<6>igb2: promiscuous mode enabled
<6>ax0: promiscuous mode enabled
<6>ax1: promiscuous mode enabled
<118>Configuring ALL_LAN interface...done.
<6>tun1: changing name to 'ovpns1'
<118>Generating /etc/resolv.conf...done.
<118>Generating /etc/hosts...done.
<118>Configuring firewall.......done.
<118>Configuring OpenSSH...done.
<118>Starting web GUI...done.
<118>Setting up routes...done.
<118>Starting DHCPv4 service...done.
<118>Starting Unbound DNS...done.
<118>Configuring firewall.......done.
<118>Setting up gateway monitor...done.
<118>Syncing OpenVPN settings...done.
<118>Starting NTP service...done.
<118>Starting Unbound DNS...done.
<118>Starting power daemon...done.
<118>>>> Invoking start script 'newwanip'
<118>Reconfiguring IPv4 on igb0
<118>>>> Invoking start script 'freebsd'
<118>Starting ddclient_opn.
<118>Starting monit.
<118>Starting Monit 5.34.3 daemon with http interface at /var/run/monit.sock
<118>Starting suricata.
<118>Info: conf-yaml-loader: Including configuration file installed_rules.yaml.
<118>Info: conf-yaml-loader: Configuration node 'rule-files' redefined.
<118>Info: conf-yaml-loader: Including configuration file custom.yaml.
<118>>>> Invoking start script 'syslog'
<118>>>> Invoking start script 'carp'
<118>>>> Invoking start script 'cron'
<118>Starting Cron: OK
<118>
<118>
<118>>>> Invoking start script 'openvpn'
<118>>>> Invoking start script 'sysctl'
<118>Service `sysctl' has been restarted.
<118>>>> Invoking start script 'beep'
<6>pid 24185 (unbound), jid 0, uid 59: exited on signal 11 (no core dump - bad address)
<6>ovpns1: link state changed to UP
<118>Root file system: zroot/ROOT/default
<118>Fri Feb 21 17:10:24 CET 2025
<118>
<118>*** <REDACTED>: OPNsense 24.10.2 (amd64) ***
<118>
<118> ALL_LAN (bridge0) -> v4: <REDACTED>
<118> LAN_1 (igb1)    ->
<118> LAN_2 (igb2)    ->
<118> SFTP_LAN_1 (ax0) ->
<118> SFTP_LAN_2 (ax1) ->
<118> WAN (igb0)      -> v4/DHCP4: <REDACTED>
<118>
<118> HTTPS: sha256 <REDACTED>
<118>               <REDACTED>
<118> SSH:   SHA256 <REDACTED> (ECDSA)
<118> SSH:   SHA256 <REDACTED> (ED25519)
<118> SSH:   SHA256 <REDACTED> (RSA)
323.302187 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
323.310542 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
323.318444 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
323.326914 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
323.334811 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
323.343302 [1072] generic_netmap_dtor       Emulated netmap adapter for bridge0 destroyed
<6>bridge0: permanently promiscuous mode enabled
323.353871 [1167] generic_netmap_attach     Emulated adapter for bridge0 created (prev was NULL)
323.447491 [ 319] generic_netmap_register   Emulated adapter for bridge0 activated
323.471196 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
323.478608 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
323.486136 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: permanently promiscuous mode enabled
323.495751 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: link state changed to DOWN
324.621338 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
324.628747 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
324.636283 [ 852] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
<6>igb0: link state changed to UP
<6>igb0: permanently promiscuous mode disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x2be9baec05d0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d053a7
stack pointer         = 0x28:0xfffffe00aab72690
frame pointer         = 0x28:0xfffffe00aab726b0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 97656 (pfctl)
rdi: fffff80143aafaa0 rsi: fffffe00aab726c8 rdx: 0000000000000000
rcx: fffff80143aafc10  r8: 0000000000000000  r9: 8080808080808080
rax: 00002be9baec05e0 rbx: fffffe00aab726c8 rbp: fffffe00aab726b0
r10: fffff8011377f000 r11: ffffffffffffffff r12: 0000000000000000
r13: fffffe00c9800000 r14: ffffffff821ac000 r15: 00002be9baec05c0
trap number = 12
panic: page fault
cpuid = 1
time = 1740154441
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00aab72380
vpanic() at vpanic+0x131/frame 0xfffffe00aab724b0
panic() at panic+0x43/frame 0xfffffe00aab72510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00aab72570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00aab725c0
calltrap() at calltrap+0x8/frame 0xfffffe00aab725c0
--- trap 0xc, rip = 0xffffffff80d053a7, rsp = 0xfffffe00aab72690, rbp = 0xfffffe00aab726b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00aab726b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00aab72710
pfioctl() at pfioctl+0x2215/frame 0xfffffe00aab72bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00aab72c40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00aab72cb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00aab72cd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00aab72d40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00aab72e00
amd64_syscall() at amd64_syscall+0xf9/frame 0xfffffe00aab72f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00aab72f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x1371524965fa, rsp = 0x13714e83ff68, rbp = 0x13714e840400 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p8 pf_wlock2-n268025-e0101f20aee SMP

Quote: pfctl -t tablename -T show

Ran this multiple times on 750 and other FWs - against the largest table I have - and can't get it to crash.
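
For anyone who wants to repeat this test more systematically, here is a rough Python sketch that dumps every pf table in a loop. The function names (show_cmd, list_tables, hammer_tables) and the default round count are made up for illustration; the only pfctl invocations used are "pfctl -s Tables" and the quoted "pfctl -t tablename -T show". It reads the tables sequentially, so it does not recreate whatever concurrent activity may be involved in the real crashes, and on an affected box it could still trigger the panic, so it is not something to run outside a maintenance window.

```python
import subprocess


def show_cmd(table):
    """Build the argv for the quoted command: pfctl -t <table> -T show."""
    return ["pfctl", "-t", table, "-T", "show"]


def list_tables(run=subprocess.run):
    """Return the table names printed by `pfctl -s Tables`, one per line."""
    result = run(["pfctl", "-s", "Tables"], capture_output=True, text=True, check=True)
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def hammer_tables(rounds=100, run=subprocess.run):
    """Dump every table `rounds` times and return the number of dumps performed."""
    dumps = 0
    for _ in range(rounds):
        # Re-read the table list each round in case tables come and go.
        for table in list_tables(run):
            run(show_cmd(table), capture_output=True, text=True, check=True)
            dumps += 1
    return dumps
```

Calling hammer_tables() as root on the firewall runs the loop; the run parameter exists only so the command runner can be swapped out for a dry run.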

@if8ps3Jc would you mind providing a vmcore file via debug kernel? So the issue is structural, not coincidental...

It only needs the debug kernel, a reboot and a crash to render the file under /var/crash/vmcore.0
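
To check whether a dump was actually written after the next crash, something along these lines could be used. This is only a sketch: the helper name find_vmcores is hypothetical, and the /var/crash default is simply the directory mentioned above.

```python
from pathlib import Path


def find_vmcores(crash_dir="/var/crash"):
    """Return the vmcore.* files in crash_dir, sorted lexically; empty list if none."""
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("vmcore.*") if p.is_file())
```

For example, "for core in find_vmcores(): print(core, core.stat().st_size)" lists each dump with its size in bytes; an empty result means no vmcore has been saved yet.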

# opnsense-update -zkr dbg-24.7.13


Thanks,
Franco

Hi @franco
Sure, I just submitted a crash report via the OPNsense GUI. I think the vmcore file is included, right? At least it was mentioned there ("File too big to display, will be submitted automatically...").

The vmcore is not submitted, but it will probably be removed after the report is sent... (only the information shown is sent)


Small update: I've been working with if8ps3Jc on this basically all of last week to get closer. There is corruption in the rules engine table, but we have not pinpointed it yet: we can't get a usable vmcore out of these crashes, and while we can observe the corrupted access, it's unclear where the corruption itself occurs. The theory is that it happens on the traffic matching end, which would also explain why it's hard to catch or reproduce.
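
As a side note on what a "corrupted access" looks like in the panic output, the two fault virtual addresses posted in this thread can be sorted into rough classes with a few lines of Python. This is a sketch under stated assumptions: it assumes amd64 with 48-bit virtual addresses (kernel addresses in the upper canonical half), and the class names and the 4 KiB NULL-page threshold are arbitrary choices for illustration. It does not show where the corruption happens, only that neither address is a plausible kernel pointer.

```python
# Assumption: amd64 with 48-bit virtual addresses.
USER_HALF_END = 0x0000800000000000      # end of the lower canonical half
KERNEL_HALF_START = 0xFFFF800000000000  # start of the upper canonical half
NULL_PAGE_LIMIT = 0x1000                # first 4 KiB: NULL pointer plus a small offset


def classify_fault_address(addr):
    """Roughly classify a fault virtual address taken from a panic message."""
    if addr < NULL_PAGE_LIMIT:
        return "null-offset"
    if addr < USER_HALF_END:
        return "user-range"
    if addr >= KERNEL_HALF_START:
        return "kernel-range"
    return "non-canonical"


# Fault virtual addresses from the two panics posted above.
for label, addr in [("first crash", 0x28), ("second crash", 0x2BE9BAEC05D0)]:
    print(f"{label}: {addr:#x} -> {classify_fault_address(addr)}")
# prints:
# first crash: 0x28 -> null-offset
# second crash: 0x2be9baec05d0 -> user-range
```

A NULL-plus-offset read in one panic and a user-range address in the other, both inside rn_walktree(), would fit a tree node whose pointers were overwritten, but that is an interpretation and not something the logs prove.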

To be continued...


Cheers,
Franco