IPS mode in Suricata causes kernel panic

Started by kotashiratsuka, July 26, 2024, 02:48:21 PM

Upgraded from 24.1 to 24.7 running on XCP-ng

When Intrusion Detection is turned off, or IPS mode in Suricata is unchecked, the kernel panic does not occur.

Fatal trap 12: page fault while in kernel mode
cpuid = 10; apic id = 14
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80a0f15f
stack pointer         = 0x28:0xfffffe0121e228e0
frame pointer         = 0x28:0xfffffe0121e22970
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 34811 (W#01-xn1^)
rdi: fffff80006de9000 rsi: fffff80008746300 rdx: fffff80008746300
rcx: fffff80004175800  r8: 00000000000000f8  r9: 00000ac9ea080000
rax: 00000000000000ff rbx: fffffe006d2f3000 rbp: fffffe0121e22970
r10: de19cd6057e97e01 r11: fffff8000bb94c60 r12: 0000000000000000
r13: fffff80006300800 r14: fffffe0121e22944 r15: fffff80008746300
trap number = 12
panic: page fault
cpuid = 10
time = 1721996741
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0121e225d0
vpanic() at vpanic+0x131/frame 0xfffffe0121e22700
panic() at panic+0x43/frame 0xfffffe0121e22760
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0121e227c0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0121e22810
calltrap() at calltrap+0x8/frame 0xfffffe0121e22810
--- trap 0xc, rip = 0xffffffff80a0f15f, rsp = 0xfffffe0121e228e0, rbp = 0xfffffe0121e22970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe0121e22970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe0121e229a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe0121e229f0
generic_netmap_txsync() at generic_netmap_txsync+0x3a2/frame 0xfffffe0121e22ae0
netmap_ioctl() at netmap_ioctl+0x1a7/frame 0xfffffe0121e22bb0
freebsd_netmap_ioctl() at freebsd_netmap_ioctl+0x79/frame 0xfffffe0121e22bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe0121e22c40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe0121e22cb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe0121e22cd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe0121e22d40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe0121e22e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe0121e22f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0121e22f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x82dcc75fa, rsp = 0x8466f7df8, rbp = 0x8466f7e20 ---
KDB: enter: panic
/var/crash/textdump.tar.1:
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p2 stable/24.7-n267758-4ad7ad40bc77 SMP
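
For anyone who wants to pull the same two files out of their own textdump instead of pasting the raw archive, here is a minimal Python sketch. It assumes the archive is a plain tar containing members named panic.txt and version.txt, as the dump above suggests. The function name read_textdump and the in-memory demo archive are made up for illustration; the demo just mirrors the contents seen in this thread.

```python
import io
import tarfile


def read_textdump(path=None, fileobj=None, wanted=("panic.txt", "version.txt")):
    """Return {member name: text} for the wanted members of a textdump tar.

    Pass either a filesystem path (e.g. /var/crash/textdump.tar.1) or an
    already opened binary file object.
    """
    found = {}
    with tarfile.open(name=path, fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if member.isfile() and member.name in wanted:
                data = archive.extractfile(member).read()
                found[member.name] = data.decode("utf-8", errors="replace").strip()
    return found


def build_demo_archive():
    """Build an in-memory tar that mimics the two members shown in this thread."""
    members = (
        ("panic.txt", "page fault\n"),
        ("version.txt", "FreeBSD 14.1-RELEASE-p2 stable/24.7-n267758-4ad7ad40bc77 SMP\n"),
    )
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in members:
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, text in read_textdump(fileobj=build_demo_archive()).items():
        print(f"{name}: {text}")
```

On a real box you would call read_textdump(path="/var/crash/textdump.tar.1") instead of using the demo archive.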


Thanks, I'm trying to pass this to the right authority.


Cheers,
Franco

July 27, 2024, 01:13:03 AM #2 Last Edit: July 27, 2024, 10:27:06 AM by teej1980uk
+1 from me as well, running 24.7_5 in AWS on a t2.large. As soon as I disable IPS, the reloads stop.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80a0f15f
stack pointer         = 0x28:0xfffffe00f4ef18e0
frame pointer         = 0x28:0xfffffe00f4ef1970
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 93696 (W#01-xn0^)
rdi: fffff80004e75000 rsi: fffff80004cd1c00 rdx: fffff80004cd1c00
rcx: fffff80003e97c00  r8: 000000000000003d  r9: 0000000000000800
rax: 00000000000000ff rbx: fffffe00d917f000 rbp: fffffe00f4ef1970
r10: 0000000000000301 r11: fffff80271e38c60 r12: 0000000000000000
r13: fffff80003b43800 r14: fffffe00f4ef1944 r15: fffff80004cd1c00
trap number = 12
panic: page fault
cpuid = 1
time = 1722034585
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f4ef15d0
vpanic() at vpanic+0x131/frame 0xfffffe00f4ef1700
panic() at panic+0x43/frame 0xfffffe00f4ef1760
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00f4ef17c0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00f4ef1810
calltrap() at calltrap+0x8/frame 0xfffffe00f4ef1810
--- trap 0xc, rip = 0xffffffff80a0f15f, rsp = 0xfffffe00f4ef18e0, rbp = 0xfffffe00f4ef1970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe00f4ef1970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe00f4ef19a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe00f4ef19f0
generic_netmap_txsync() at generic_netmap_txsync+0x3a2/frame 0xfffffe00f4ef1ae0
netmap_ioctl() at netmap_ioctl+0x1a7/frame 0xfffffe00f4ef1bb0
freebsd_netmap_ioctl() at freebsd_netmap_ioctl+0x79/frame 0xfffffe00f4ef1bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00f4ef1c40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00f4ef1cb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00f4ef1cd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00f4ef1d40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00f4ef1e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00f4ef1f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00f4ef1f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x829f2e5fa, rsp = 0x844b94df8, rbp = 0x844b94e20 ---
KDB: enter: panic

OK, you posted the less relevant part of the report. The stack trace, which starts below "KDB: stack backtrace:", would be more useful :)

My bad, apologies. I've updated my original post with the complete output.

Thanks, same one indeed. Will dig into it next week.


Cheers,
Franco
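
As a rough way to check that two reports really are "the same one", the function+offset pairs of each backtrace can be compared while ignoring the frame addresses, which differ per machine. The following is only a sketch: the regex is written against the line format seen in the two traces above, the helper names call_chain and faulting_frame are made up, and the sample strings are shortened excerpts of the XCP-ng and AWS reports.

```python
import re

# Matches lines such as:
#   xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe0121e229a0
FRAME_RE = re.compile(
    r"^(?P<func>\w+)\(\) at (?P=func)\+(?P<offset>0x[0-9a-f]+)/frame 0x[0-9a-f]+"
)


def call_chain(report):
    """Return [(function, offset), ...] for frames after 'KDB: stack backtrace:'."""
    frames = []
    in_trace = False
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("KDB: stack backtrace:"):
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.match(line)
            if match:
                frames.append((match["func"], match["offset"]))
    return frames


def faulting_frame(report):
    """Return the first frame listed after the '--- trap' marker, or None."""
    seen_trap = False
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("--- trap"):
            seen_trap = True
            continue
        if seen_trap:
            match = FRAME_RE.match(line)
            if match:
                return (match["func"], match["offset"])
    return None


# Shortened excerpts of the two reports in this thread.
XCP_NG_REPORT = """KDB: stack backtrace:
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0121e22810
calltrap() at calltrap+0x8/frame 0xfffffe0121e22810
--- trap 0xc, rip = 0xffffffff80a0f15f, rsp = 0xfffffe0121e228e0, rbp = 0xfffffe0121e22970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe0121e22970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe0121e229a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe0121e229f0
"""

AWS_REPORT = """KDB: stack backtrace:
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00f4ef1810
calltrap() at calltrap+0x8/frame 0xfffffe00f4ef1810
--- trap 0xc, rip = 0xffffffff80a0f15f, rsp = 0xfffffe00f4ef18e0, rbp = 0xfffffe00f4ef1970 ---
xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0xdf/frame 0xfffffe00f4ef1970
xn_txq_mq_start() at xn_txq_mq_start+0x76/frame 0xfffffe00f4ef19a0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0xa0/frame 0xfffffe00f4ef19f0
"""

if __name__ == "__main__":
    same = call_chain(XCP_NG_REPORT) == call_chain(AWS_REPORT)
    print("same call chain:", same)
    print("faulting frame:", faulting_frame(XCP_NG_REPORT))
```

Comparing function+offset rather than whole lines is deliberate: the stack addresses after "/frame" are different on every host even when the crash path is identical.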

July 29, 2024, 07:42:23 AM #6 Last Edit: July 29, 2024, 07:46:49 AM by franco
Problem actually appears to be xen(4) in particular.

Will continue in https://github.com/opnsense/src/issues/211


Cheers,
Franco

Kernel to try:

# opnsense-update -zkr 24.7_7

Don't forget to reboot.


Cheers,
Franco

Thanks Franco :)

I see 24.7_9 is now out, is this patch rolled into 24.7_9 also?

It's broken somehow. You will need to disable IPS mode in intrusion detection prior to the 24.7 upgrade or sit this one out on 24.1.x.

Passing this on to the FreeBSD crowd at the moment.


Cheers,
Franco

Oh dear. I appreciate it's potentially a FreeBSD issue, but one of my impacted gateways is a paid subscription in AWS. Is there any scope to accelerate troubleshooting/patching? Sadly, the IPS feature is quite important for this gateway.

Many thanks.

It was handed off to FreeBSD devs. This is the highest level of acceleration we have.


Cheers,
Franco

No probs, appreciate the feedback, thank you :)


Had to publish a second kernel because the first patch wasn't complete:

https://github.com/opnsense/src/issues/211#issuecomment-2264652883

Feedback is highly appreciated in order to get this into 24.7.1.


Cheers,
Franco